Abstract


Present  day  systems,  intelligent  or  otherwise,  are  limited  by  the  conceptualizations  of  the 
world  given  to  them  by  their  designers.  This  thesis  explores  issues  in  the  construction  of 
adaptive  systems  that  can  incrementally  reformulate  their  conceptualizations  to  achieve 
computational  efficiency  or  descriptional  adequacy.  A  detailed  account  of  a  special  case 
of  the  reformulation  problem  is  presented:  we  reconceptualize  a  knowledge  base  in  terms 
of  new  abstract  objects  and  relations  in  order  to  make  the  computation  of  a  given  class  of 
queries  more  efficient. 

Automatic reformulation will not be possible unless a reformulator can justify a shift
in conceptualization. We present a new class of meta-theoretical justifications for a
reformulation, called irrelevance explanations. A logical irrelevance explanation proves that
certain distinctions made in the formulation are not necessary for the computation of a
given class of problems. A computational irrelevance explanation proves that some
distinctions are not useful with respect to a given problem solver for a given class of problems.
Inefficient formulations make irrelevant distinctions, and the irrelevance principle logically
minimizes a formulation by removing all facts and distinctions in it that are not needed
for the specified goals. The automation of the irrelevance principle is demonstrated with
the generation of abstractions from first principles. We also describe the implementation
of an irrelevance reformulator and outline experimental results that confirm our theory.
Chapter 1

Introduction 


1.1  The  Need  for  Reformulation 

One  of  the  most  important  hypotheses  in  the  field  of  artificial  intelligence  (AI)  is  the 
Physical  Symbol  System  Hypothesis  [NS76]:  Computation  over  symbolic  representations 
is  both  necessary  and  sufficient  for  obtaining  intelligent  behaviour.  Over  the  last  30  years, 
research  in  AI  in  this  paradigm  has  concentrated  on  developing  sophisticated  methods 
of  controlling  search  in  the  space  defined  by  a  fixed  representation  provided  by  a  human 
programmer.  In  this  thesis,  I  investigate  the  complementary  question  of  how  to  reduce  or 
avoid  search  by  changing  representations  automatically.  This  is  the  reformulation  question 
first  introduced  in  [New65]. 

The  main  motivation  for  reformulation  stems  from  the  observation  that  intelligent 
control  of  search  cannot  always  fix  problems  caused  by  a  bad  representation.  A  dramatic 
example  that  illustrates  this  is  the  mutilated  checkerboard  problem  [McC64].  Suppose 
we  cut  off  two  diagonally  opposite  corners  of  an  8  by  8  checkerboard.  Can  we  cover  this 
mutilated  board  by  tiles  of  dimension  1  by  2?  The  naive  formulation  of  this  problem 
demonstrates the impossibility of achieving such a covering by trying all possible arrangements.
A reconceptualization of this problem uses the fact that the two diagonally opposite
squares  are  of  the  same  colour  and  that  each  tile  covers  one  square  of  one  colour  and  one 
of  the  other  colour.  Since  there  are  30  squares  of  one  colour  and  32  of  the  other,  there 
is  no  possible  tiling.  This  reduces  the  solution  that  uses  exhaustive  search  to  a  simple 
counting  argument.  How  can  a  system  discover  this  formulation  of  the  problem?  Newell 
and  Simon  pose  this  question  in  their  Turing  Award  Lecture  in  1975.  They  go  on  to  add 
that:


The whole process of moving from one representation to another, and of discovering
and evaluating representations, is largely unexplored territory in the
domain of problem-solving research. The laws of qualitative structure governing
representations remain to be discovered.

The  challenge  above  remains  open  even  today:  the  power  of  most  AI  systems  still  lies  in 
the  representations  given  to  them  by  their  human  designers.  A  first  step  to  meeting  this 
challenge  is  the  theoretical  investigation  of,  and  the  development  of,  tools  to  incrementally 
re-design  representations — this  is  the  reformulation  problem.  Automating  incremental 
reformulations  is  very  important,  because  it  will  go  far  towards  relieving  humans  of  the 
tedium  and  inflexibility  of  programming  all  possible  conceptualizations  of  the  world  into 
AI  systems. 

Inspired  by  the  above,  the  theoretical  framework  developed  in  the  thesis  is  designed 
to  automate  the  process  of  redesigning  representations,  which  until  now  has  been  done 
entirely  by  humans.  Our  practical  interest  is  in  the  design  of  a  completely  autonomous 
robot  that  functions  well  under  resource  constraints  in  changing  environments.  Imagine 
a  robot  with  a  very  detailed  theory  of  the  world.  If  the  environment  demanded  faster 
prediction,  the  robot  should  build  an  approximate  theory  of  the  world  on  top  of  the 
more  detailed  one  to  allow  for  efficient  computation.  Both  theories  are  about  the  same 
phenomena;  the  abstract  theory  is  a  reformulation  that  makes  fewer  distinctions  and  is  thus 
much  more  efficient.  Conversely,  a  system  with  a  very  crude  theory,  say  about  substances, 
would  do  very  well  to  reformulate  by  introducing  distinctions  based  on  states  of  matter. 
A  robot  designed  to  pick  objects  off  of  an  assembly  line  should  reformulate  its  conceptual 
hierarchy  of  objects  to  correspond  to  distinctions  made  on  the  basis  of  graspability. 

In  the  mutilated  checkerboard  problem,  replacing  the  initial  set  of  distinctions  based 
on  board  positions  by  the  more  abstract  one  based  on  colour  (or  board  positions  modulo 
2) made the solution of the problem “transparent”. Reformulation is the science of removing
irrelevant distinctions and introducing necessary ones to accomplish goals effectively
and efficiently. Our aim in this thesis is to uncover general principles of change of
conceptualizations which are notation- as well as domain-independent. We propose a general
methodology for automating reformulations centered around these principles and instantiate
it in the context of the specific problem of automating abstraction reformulations for
computational efficiency1.

But  first,  are  there  such  general  principles?  And  second,  are  they  generative — that  is, 
will  they  suggest  new  reformulations?  If  one  sets  out  to  propose  a  theory  of  reformulation 
that  could  generate  all  the  spectacular  reformulations  in  the  history  of  science,  it  would 
appear  that  the  answer,  at  least  to  the  latter  question,  is  negative.  The  reformulation 
of the geocentric model of planetary motion to the heliocentric one, as well as the
frequency domain characterization of time-dependent behaviour by the Laplace transform,
are  special-case  reconceptualizations  that  depended  on  then-undiscovered  properties  of 
problems  in  those  domains.  However,  these  revolutionary  transformations  account  for  a 
very  small  percentage  of  the  reformulation  phenomena  in  humans.  Most  reformulations 
are  incremental  and  by  comparison  with  the  history  of  science  examples,  mundane.  For 
instance,  we  conceptualize  a  road  as  a  line  when  planning  a  trip,  as  a  surface  when  we 
cross  it,  and  as  a  volume  when  we  attempt  to  dig  it2.  We  move  between  interval-  and 
instant-based representations of time fairly easily. These granularity [Hob85] transformations
are fairly routine in humans. They are characterized by the incremental refinement
or  coarsening  of  distinctions  made  in  the  description  of  phenomena  in  order  to  make  the 
goals  of  the  agent  easy  to  achieve.  This  thesis  attempts  to  find  invariants  in  granularity 
shifts  and  state  them  precisely  enough  to  generate  the  shifts  automatically. 

1.2  Towards  a  Formulation  of  Reformulation 

One  of  the  main  difficulties  in  posing  the  reformulation  question  comes  from  the  fact 
that  reformulation  is  an  ill-understood  phenomenon.  Also,  as  with  creativity,  there  is  a 
mysticism  associated  with  the  ability  to  reformulate.  Our  first  task  is  to  isolate  interesting 
and useful parts of the phenomenon that permit a relatively intuitive knowledge-based
solution. 

When we conceptualize a problem, we first identify the objects, functions and relations
that are needed to state it. These elements represent the distinctions necessary to
describe the domain as well as the goals. Reformulation is about changing distinctions:
changing the objects, functions and relations needed to formulate the problem. Semantically,
reformulation is a shift in conceptualization. Conceptualizations are communicated

1 The mutilated checkerboard problem is a compelling example of this.

2 This example is thanks to Jerry Hobbs.




to  a  machine  by  writing  sentences  in  an  appropriate  language.  Reformulation  is  achieved 
by  changing  representations  or  encodings  of  conceptualizations. 

A  reformulation  is  correct  with  respect  to  a  set  of  goals  if  the  answers  to  the  goals 
are  preserved  across  the  conceptual  shift.  The  answer  to  the  goal  of  tiling  the  mutilated 
checkerboard  is  preserved  in  the  new  formulation;  the  reconceptualization  in  this  case  is 
a correct one. Other standard examples of correct reformulations include the rectangular
to polar coordinate transformations, and the Laplace and Fourier transforms. In all
these cases, the basic primitives used to describe the problem are changed as a result of
the reformulation. This shift in conceptualization causes a reconfiguration of the search
space of solutions to a problem. For example, in the initial formulation of the mutilated
checkerboard problem, there were 62 distinct objects; the reformulation grouped them
into 30 objects of one colour and 32 of another. This regrouping causes the search space
to shrink from size $2^{62}$ to 1. Reformulations are thus for a purpose: all of the conceptual
transformations  above  lead  to  formulations  that  are  computationally  effective  for  certain 
classes  of  queries.  A  reformulation  is  called  good  if  it  leads  to  faster  solution  of  the  goals. 
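
The counting argument for the mutilated checkerboard can be checked mechanically. The following is a minimal Python sketch, not taken from the thesis; the names N, removed, colour, counts and tiling_possible_by_count are illustrative assumptions. It regroups the 62 remaining squares by board position modulo 2 and compares the two totals.

```python
# Illustrative sketch (names are assumptions, not from the thesis).
N = 8
removed = {(0, 0), (N - 1, N - 1)}  # two diagonally opposite corners

# The 62 remaining board positions of the initial formulation.
squares = [(r, c) for r in range(N) for c in range(N) if (r, c) not in removed]


def colour(square):
    """Abstract a board position to its colour (position modulo 2)."""
    r, c = square
    return (r + c) % 2


# Number of squares of each colour after the corners are removed.
counts = [sum(1 for s in squares if colour(s) == k) for k in (0, 1)]


def tiling_possible_by_count(colour_counts):
    """Necessary condition only: each 1 by 2 tile covers one square of each colour."""
    return colour_counts[0] == colour_counts[1]


print(len(squares), counts, tiling_possible_by_count(counts))
```

Because every tile covers one square of each colour, unequal counts rule out a tiling without enumerating any arrangement.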

Before defining reconceptualizations, we provide a formal definition of a conceptualization.

Definition 1 A conceptualization is a triple $(O, \mathcal{F}, \mathcal{R})$ where $O$ is a set of objects called
the universe of discourse; $\mathcal{F}$, called the functional basis set, is a subset of functions from
$O^n$ to $O$; and $\mathcal{R}$, called the relational basis set, is a subset of relations on $O^m$, for $n, m$
in the set of natural numbers.

We can conceptualize kinship among a set of individuals as in Figure 1.1. A conceptualization
makes our ontological commitments explicit. Another conceptualization of the
kinship problem is in Figure 1.2. The canonical language $\mathcal{L}_C$ of a conceptualization C has
a distinct symbol name for every object, function and relation in C.

Definition 2 An encoding E of a conceptualization C is a set of sentences in the canonical
language $\mathcal{L}_C$ such that C is one of the models of E under Tarskian interpretation.

An  encoding  of  the  conceptualization  in  Figure  1.1  is  in  Figure  1.3. 

We  will  say  that  a  conceptualization  is  definable  in  terms  of  another  if  it  can  be 
constructed  from  the  other,  along  with  some  background  knowledge  in  the  form  of  another 
conceptualization  A. 
Objects: the set P of people {A,B,C,D,E,F,G}

Functions:  the  function  Father  from  P  to  P,  which  is  the  set 
{(A,B),(A,C),(B,D),(B,E),(C,F),(C,G)} 

Relations:  the  relation  Ancestor,  which  is  the  following  subset  of  P2, 

{(A,B),(A,C),(A,D),(A,E),(A,F),(A,G),(B,D),(B,E),(C,F),(C,G)} 

the  relation  SameFamily,  which  is  the  set  P2 

Figure 1.1: The conceptualization C1

Objects:  the  set  P  of  people  {A,B,C,D,E,F,G} 

Relations:  the  relation  FoundingFather,  which  is  the  following  subset  of  P2, 
{(A,B),(A,C),(A,D),(A,E),(A,F),(A,G)} 

the  relation  SameFamily,  which  is  the  set  P2 

Figure  1.2:  Another  conceptualization  C2 


Father(a,b)
Father(a,c)
Father(b,d)
Father(b,e)
Father(c,f)
Father(c,g)
Father(x,y) => Ancestor(x,y)
Ancestor(x,z) ∧ Ancestor(z,y) => Ancestor(x,y)

Figure 1.3: The encoding E1 of C1
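
As an illustration, the conceptualization of Figure 1.1 can be written down as plain finite sets. This is a sketch under the assumption that Father is stored as a set of ordered pairs, exactly as it is listed in the figure; the helper transitive_closure and the dictionary layout of C1 are hypothetical names introduced here. The sketch also checks that the Ancestor relation of Figure 1.1 coincides with the transitive closure of Father, which is what the two rules of Figure 1.3 express.

```python
from itertools import product

# Objects of Figure 1.1.
P = set("ABCDEFG")

# Father, stored as the set of pairs listed in Figure 1.1.
FATHER = {("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F"), ("C", "G")}


def transitive_closure(rel):
    """Smallest transitive relation containing rel (naive fixpoint iteration)."""
    closure = set(rel)
    while True:
        new_pairs = {(x, y)
                     for (x, z) in closure
                     for (z2, y) in closure
                     if z == z2}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


ANCESTOR = transitive_closure(FATHER)
SAME_FAMILY = set(product(P, repeat=2))  # the set P^2

# Hypothetical container for the conceptualization: objects plus basis relations.
C1 = {
    "objects": P,
    "relations": {"Father": FATHER, "Ancestor": ANCESTOR, "SameFamily": SAME_FAMILY},
}

print(sorted(ANCESTOR))
```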
Definition 3 A conceptualization C2 is a reconceptualization of C1 with respect to another
conceptualization A if the elements of C2 are definable from C1 and A. The definitions of
the elements of C2 in terms of A and C1 constitute the articulation theory between the two
conceptualizations.

In our kinship example, the conceptualization C2 can be constructed out of C1 by dropping
the Father and Ancestor relations and introducing the new relation FoundingFather
which is a subset of the Ancestor relation. Since A = ∅ here, we have not introduced
any new distinctions in this shift; we simply coarsened the Ancestor distinction. This
reconceptualization is an abstraction. In the mutilated checkerboard problem, the new
object (the set of red squares) is definable in terms of the old conceptualization and set theory
introduced via A. When A is non-empty, the new conceptualization makes distinctions
not present in C1; we shall call such reconceptualizations refinements.

Definition 4 C2 is a correct reconceptualization of C1 with respect to A and the set of
goal relations G if G is definable in C2 only if it is definable in C1.

A  goal  relation  is  definable  in  a  conceptualization  when  it  can  be  constructed  in  that  con¬ 
ceptualization.  In  a  correct  reformulation,  G  is  preserved  exactly  across  the  conceptual 
shift;  G  is  also  already  definable  in  the  initial  conceptualization.  These  reformulations 
are  called  deductive.  The  abstraction  of  the  conceptualization  in  Figure  1.1  to  Figure  1.2 
preserves the SameFamily relation; it is an instance of a deductive abstraction reformulation.
A reformulation that makes an undefinable goal definable, as in the introduction
of  the  concept  odd-integer  in  the  LeX  system  [Utg86],  is  an  inductive  reformulation. 

Definition 5 A set of sentences E2 in the language L is a re-encoding of E1 in the same
language if the two sets of sentences have the same models.

Whereas a description of reformulation at the level of conceptualizations captures
correctness constraints, it is too coarse to model computation. To describe computational
constraints on the solution of the goal, we use an encoding of the conceptualization and
describe its computational properties with respect to a given problem solver. Since our
chief interest in this thesis is in describing and automating reformulations for
computational efficiency, we will define what it means for a reformulation to satisfy computational
constraints.
Definition 6 A reformulation C2 of the conceptualization C1 is good with respect to a
problem solver PS and time and space bounds S on the computation of the goal wff g in
$\mathcal{L}_{C_2}$ if there is an encoding E of C2 that allows computation of g using PS within S. The
interpretation of g in C2 is the goal relation G.

The  reformulation  problem  can  now  be  described  as  follows: 

Given 

•  the initial encoding E1 of the conceptualization C1

•  A  description  of  the  problem  solver  PS. 

•  Correctness  constraints:  specification  of  the  goal  relation  G 

•  Goodness  constraints:  time  and  space  bounds  on  the  computation 
of  the  goal  relation. 

Find 

•  a  correct  and  good  reconceptualization  C2. 

Before  we  proceed  with  a  positive  characterization  of  the  space  of  reformulations  for 
computational  efficiency,  it  is  worthwhile  to  recount  that 

Theorem  1  There  exist  problems  whose  computational  efficiency  cannot  be  improved  by 
reformulation. 

Proof: The Traveling Salesman Problem cannot be made easier to solve by a representation
shift. The only way to improve efficiency is to change the solution criterion, i.e.,
accept  a  satisficing  [Sim82]  solution  as  opposed  to  the  optimal  one.  □. 

The  study  of  NP-complete  problems  in  theoretical  computer  science  tells  us  that  there 
are  intrinsically  hard  problems —  no  clever  representation  or  control  strategy  can  reduce 
the complexity of such problems. The significance of NP-completeness results to reformulation
is analogous to the significance of the second law of thermodynamics for physics
and  engineering;  they  tell  us  what  is  reasonable  to  attempt. 

We represent formulations as partial logical theories and regard conceptualizations as
their intended models. We modify the notion of a standard first order model to include
not only the objects $O$, but also the functions $\mathcal{F}$ and the relations $\mathcal{R}$. A model is thus the
structure $(O, \mathcal{F}, \mathcal{R})$. In this thesis we examine those shifts in formulation that correspond
to reconceptualizations of their models, and which lead to faster solution of a given goal
schema. In the traditional AI manner, we ask whether we have in-principle testers and
generators for partial theories and their intended models.

1.  Generation: Given a model $(O, \mathcal{F}, \mathcal{R})$, what is the space of possible reconceptualizations
that preserve answers to the goal? Given a partial theory, what is the space of
reencodings of this theory?

2.  Recognition:  Given  two  conceptualizations,  can  we  determine  whether  or  not  the 
goal  schema  is  definable  in  both?  Can  we  determine  whether  or  not  two  reencodings 
are  equivalent  modulo  a  goal? 

In Chapter 2, we consider the generation question at length. To give a flavour of the style
of analysis, let us consider the class of information losing or abstraction reformulations for
the present. The space of possible reconceptualizations in this case consists of all those
formulations whose conceptual primitives are definable from the given conceptualization.
For a finite conceptualization with $n_o$ objects, the number of definable relations is $O(2^{n_o})$.
This space is extremely large even for small values of $n_o$.
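
To get a feel for the size of this space, the sketch below enumerates only the unary relations (subsets) over a small universe; this is an illustration under that simplifying assumption, and the function names unary_relations and count_binary_relations are hypothetical. Even this restricted count doubles with every added object, and the number of binary relations over n objects is 2^(n*n).

```python
from itertools import combinations


def unary_relations(objects):
    """Yield every subset of the universe, i.e. every unary relation over it."""
    objs = sorted(objects)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            yield frozenset(subset)


def count_binary_relations(n):
    """Number of subsets of the n*n ordered pairs over n objects."""
    return 2 ** (n * n)


# Counts for the seven-person universe of the kinship example.
people = "ABCDEFG"
print(sum(1 for _ in unary_relations(people)), count_binary_relations(len(people)))
```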

With  respect  to  the  recognition  question  too,  we  have  the  following  mixed  bag  of 
results. 

Theorem 2 For finite conceptualizations C1 and C2, and a finite goal relation G, the
problem of determining $\mathit{Definable}(G, C_1) \equiv \mathit{Definable}(G, C_2)$ is decidable.

Theorem 3 For two encodings E1 and E2, and a goal schema g, the problem of determining
$E_1 \vdash g \equiv E_2 \vdash g$ is undecidable.

Proof: In general, it is undecidable whether or not two encodings E1 and E2 are equivalent
with respect to a goal schema g. The proof proceeds by reduction to the halting problem
and is in [Shm86]. □

The  significance  of  these  negative  results  is  that  a  highly  powerful  mechanism  for 
inventing  completely  novel  representations  is  unlikely  to  exist.  Also  as  pointed  out  by 
Simon,  in  Models  of  Discovery  Processes ,  such  a  mechanism  would  have  poor  psychological 
plausibility  because  it  would  predict  far  more  novelty  than  what  occurs. 

The  positive  conclusions  to  be  drawn  are 

1.  Only  abstraction  reformulations  can  be  automated  at  the  present,  because  we  have 
an  in-principle  generator  for  the  space  of  such  reformulations. 
Solve(x)      <-  Lookup(x)
Solve(x)      <-  Lookup(x <= y) ∧ Solve(y)
Solve(x ∧ y)  <-  Solve(x) ∧ Solve(y)
Solve(¬x)     <-  Thnot(x)    (negation by failure)

Figure 1.4: A Simple Depth-First Backward-Chainer

2.  Constraints  on  recognition  restrict  the  form  of  the  new  abstraction.  In  particular, 
we  will  constrain  all  definitions  to  be  universally  quantified  Horn  formulae. 
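
The four rules of Figure 1.4 can be sketched in Python for the ground (variable-free) case. This is an illustration under stated assumptions rather than the thesis's implementation: it omits unification, represents atoms as strings and conjunction and negation as tuples, and adds a depth bound that is not part of the figure. The names solve, FACTS and RULES are hypothetical.

```python
# Ground facts and ground rule instances (head, body) for a small kinship chain.
FACTS = {"Father(a,b)", "Father(b,d)"}
RULES = [
    ("Ancestor(a,b)", "Father(a,b)"),
    ("Ancestor(b,d)", "Father(b,d)"),
    ("Ancestor(a,d)", ("and", "Ancestor(a,b)", "Ancestor(b,d)")),
]


def solve(goal, facts, rules, depth=0, max_depth=25):
    """Depth-first backward chaining over ground goals."""
    if depth > max_depth:
        # Crude guard against runaway recursion (an addition, not in Figure 1.4).
        return False
    if isinstance(goal, tuple):
        op = goal[0]
        if op == "and":   # Solve(x AND y) <- Solve(x) AND Solve(y)
            return (solve(goal[1], facts, rules, depth + 1, max_depth)
                    and solve(goal[2], facts, rules, depth + 1, max_depth))
        if op == "not":   # Solve(NOT x) <- Thnot(x), negation by failure
            return not solve(goal[1], facts, rules, depth + 1, max_depth)
        raise ValueError("unknown connective: %r" % (op,))
    if goal in facts:     # Solve(x) <- Lookup(x)
        return True
    for head, body in rules:  # Solve(x) <- Lookup(x <= y) AND Solve(y)
        if head == goal and solve(body, facts, rules, depth + 1, max_depth):
            return True
    return False


print(solve("Ancestor(a,d)", FACTS, RULES))
```

Rules are tried in the order they are listed and the first successful proof is returned, which mirrors the depth-first strategy of the figure.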

1.3  An  Example 

To instantiate the components of this problem, we present an example: this is a reformulation
that is familiar to computer scientists, the re-representation of an equivalence
relation by a partition.

•  The  initial  encoding  and  conceptualization: 

We take the kinship problem conceptualized in Figure 1.1 and its encoding in Figure 1.3.
Notice that the relation Ancestor is defined to be the transitive closure of
Father.  The  goal  is  to  determine  whether  or  not  two  people  in  P  belong  to  the  same 
family:  two  people  belong  to  the  same  family  if  they  have  a  common  ancestor. 

•  The  Problem  Solver: 

The  problem  solver  that  works  on  this  formulation  of  the  problem  is  a  depth  first 
backward  chainer  (e.g.,  Prolog).  The  axiomatic  specification  of  the  problem  solver  is 
in  Figure  1.4.  The  simple  cost  model  of  problem  solving  actions  shown  in  Figure  1.5 
is  used  to  determine  how  expensive  the  process  of  proof  construction  is  in  terms 
of time and space. The specifics of the time and space cost functions are not very
important;  the  methodology  proposed  here  works  for  any  well-defined  cost  model. 

•  The  correctness  and  goodness  constraints. 

The correctness constraint is the preservation of the SameFamily relation. The reformulation
and the present formulation have to behave identically with respect to
this goal-schema. The goodness constraints are:
Proof-Height(x)               =  If x in Formulation then c1
Proof-Height(x)               =  If x <= y in Formulation then c2 + Proof-Height(y)
Proof-Height(x ∧ y)           =  Max(Proof-Height(x), Proof-Height(y))
Cost(Lookup(Ground Literal))  =  c1
Cost(Lookup(x <= y))          =  c2

Figure 1.5: A Cost Model for the Problem Solver

-  The new formulation should be able to solve any SameFamily query faster than
the old formulation. In particular, we want these queries to be answered in
O(1) time. Note that in the old formulation, this requires time proportional
to the height of the Father tree.

-  The  new  formulation  can  only  consume  as  much  space  as  the  old  formulation. 
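
The cost model of Figure 1.5 can be sketched for ground goals as follows. This is a minimal illustration with assumed unit values for the constants c1 and c2 (the thesis leaves them unspecified) and an assumed acyclic rule set; the names COST_C1, COST_C2, FACTS, RULES and proof_height are hypothetical.

```python
COST_C1 = 1  # assumed value of c1: cost of looking up a ground literal
COST_C2 = 1  # assumed value of c2: cost of looking up a rule x <= y

FACTS = {"Father(a,b)", "Father(b,d)"}
RULES = [
    ("Ancestor(a,b)", "Father(a,b)"),
    ("Ancestor(b,d)", "Father(b,d)"),
    ("Ancestor(a,d)", ("and", "Ancestor(a,b)", "Ancestor(b,d)")),
]


def proof_height(goal, facts, rules):
    """Proof-Height per Figure 1.5; returns None when no proof exists.

    Assumes the rule set is acyclic, so the recursion terminates.
    """
    if isinstance(goal, tuple) and goal[0] == "and":
        left = proof_height(goal[1], facts, rules)
        right = proof_height(goal[2], facts, rules)
        if left is None or right is None:
            return None
        return max(left, right)           # Max(Proof-Height(x), Proof-Height(y))
    if goal in facts:
        return COST_C1                    # x in Formulation
    for head, body in rules:
        if head == goal:
            height = proof_height(body, facts, rules)
            if height is not None:
                return COST_C2 + height   # c2 + Proof-Height(y)
    return None


print(proof_height("Ancestor(a,d)", FACTS, RULES))
```

In this toy chain the height grows with the number of Father links between the two people, which is the dependence on tree height that the goodness constraint asks the reformulation to remove.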
The  answers  that  we  expect  are 

•  The new conceptualization:

The  reconceptualization  shown  in  Figure  1.2  satisfies  the  correctness  constraints. 
The  articulation  theory  is: 

1.  The  objects  map  over  one  to  one. 

2.  The  new  relation  FoundingFather  is  described  intensionally  by  the  following 
definition: 

$\forall x, y.\ \mathit{FoundingFather}(x, y) \equiv \mathit{Ancestor}(x, y) \land \neg \exists z.\ \mathit{Ancestor}(z, x)$

•  The  new  encoding: 

The encoding shown in Figure 1.6 meets the goodness constraints because the computation
of SameFamily can be achieved in constant time using two lookups on the
FoundingFather  relation.  Also,  adding  this  new  relation  did  not  violate  the  space 
requirement. 

This is an abstraction reformulation (A = ∅). It is also iso-ontic because it did not
change  the  objects.  It  is  deductive  because  it  preserves  the  goal  relation  and  causes  it  to 
be  computed  faster. 
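
The articulation theory and the constant-time query can be sketched together in Python. This is an illustration under assumptions: relations are stored as sets of pairs, and the FoundingFather relation is indexed by its second argument in a dictionary (founder_of, a hypothetical name) so that each lookup is a hash access. As the rule for SameFamily is written, it relates only people who have a founding father, so the founder itself is not covered by this sketch.

```python
FATHER = {("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F"), ("C", "G")}


def transitive_closure(rel):
    """Smallest transitive relation containing rel (naive fixpoint iteration)."""
    closure = set(rel)
    while True:
        new_pairs = {(x, y)
                     for (x, z) in closure
                     for (z2, y) in closure
                     if z == z2}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


ANCESTOR = transitive_closure(FATHER)

# FoundingFather(x, y) holds when Ancestor(x, y) and x has no ancestor.
has_ancestor = {y for (_, y) in ANCESTOR}
FOUNDING_FATHER = {(x, y) for (x, y) in ANCESTOR if x not in has_ancestor}

# Index by descendant so that a query needs two dictionary lookups.
founder_of = {y: x for (x, y) in FOUNDING_FATHER}


def same_family(x, y):
    """FoundingFather(z, x) and FoundingFather(z, y) => SameFamily(x, y)."""
    fx = founder_of.get(x)
    fy = founder_of.get(y)
    return fx is not None and fx == fy


print(sorted(FOUNDING_FATHER), same_family("D", "G"))
```

The query no longer walks the Father tree; its cost is independent of the tree's height, at the price of storing one FoundingFather fact per descendant.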
FoundingFather(a,b)
FoundingFather(a,c)
FoundingFather(a,d)
FoundingFather(a,e)
FoundingFather(a,f)
FoundingFather(a,g)
FoundingFather(z,x) ∧ FoundingFather(z,y) => SameFamily(x,y)

Figure 1.6: The encoding E2 of C2

1.4  Why  Is  Reformulation  Hard? 

The main stumbling block to automated reformulation is the fact that the intended
interpretation of terms and symbols is never represented in the system itself. So a system
has no logical basis for changing conceptualizations. A human would find it impossible to
reformulate a description given entirely in terms of gensyms. Unfortunately, our reformulator
is in no better position than the human with an uninterpreted description. The
knowledge that is essential for reformulation consists of what the referents of the symbols
are (i.e., the mapping between the ontology or conceptualization and the symbolic encoding),
what role each element of the conceptualization plays in the computation of the goal,
as  well  as  the  space  of  possible  ways  of  perturbing  the  conceptualization.  The  reasoning 
needed  to  accomplish  reformulations  consists  of  means  of  evaluating  the  epistemic  and 
computational  consequences  of  perturbing  the  conceptualization,  and  designing  encodings 
that  correspond  to  a  given  conceptualization. 

To  do  the  above,  we  need  a  theory  of  representation  as  well  as  a  theory  of  problem 
solving.  A  theory  of  reformulation  can  be  seen  as  a  bridge  between  these  two  theories. 
This  viewpoint  on  reformulation  gives  us  another  insight  into  the  difficulties  of  automating 
it.  We  have  no  theory  of  representation,  and  a  fairly  weak  theory  of  problem  solving 
[LJP87,TF85],  so  the  construction  of  a  strong  theory  of  automating  reformulations  is 
impossible  at  this  time.  This  thesis  develops  a  theory  of  representation  centered  around  the 
idea  of  definability  of  conceptual  primitives  and  uses  existing  theories  of  problem  solving 
to develop a general-purpose weak method for generating abstraction reformulations for
computational  efficiency. 


1.5  Reformulations  from  First  Principles 

Previous  attempts  at  reformulation  have  articulated  the  required  knowledge  in  specific 
domains  in  highly  compiled  forms.  The  most  representative  of  this  class  of  research  is 
Mostow’s  PhD  thesis  [Mos81]  on  the  game  of  Hearts.  This  thesis  attempts  to  explicate 
the  origin  of  the  compiled  reformulation  rules  by  a  first-principles  analysis  that 

•  pins  down  the  interpretation  of  the  elements  in  a  conceptualization  by  providing 
properties  of  the  individual  elements  and  constraints  between  them. 

•  makes  the  relationships  between  conceptualizations  and  encodings  an  explicit  object 
to  reason  with. 

•  uses  abstract  representations  of  the  conceptualization  called  definability  structures , 
which  allow  analysis  of  the  role  of  a  conceptual  element  in  the  achievement  of  a  goal. 

•  uses  the  properties  of  the  problem  solver  and  abstract  representations  of  the  proof 
and  search  spaces  generated  on  particular  encodings  of  the  problem  to  reconfigure 
search  spaces  to  meet  computational  constraints. 

Our approach is to provide a unifying framework and a set of concepts that allow declarative
specification of the knowledge that is required for automatic reformulation. Much of
the knowledge about choice of conceptualization is left implicit, and that is why present
day systems cannot change their conceptualizations in a justified way. For instance, the
system cannot determine whether or not a vocabulary item makes the distinction it was
designed for, especially in the face of changing environments. When computational
constraints are changed, and the system has to realign boundaries, the presence of knowledge
about the role of each conceptual element in the computation of the goal makes it possible
to evaluate why the present conceptualization fails to meet the constraints and to determine
how to fix it so as to achieve them.

The  justification-based  approach  to  reformulation  makes  the  knowledge  necessary  for 
reformulation  explicit  and  available  for  the  reformulator  to  reason  with.  This  knowledge 
is  articulated  as  an  explanation  for  a  reformulation.  There  are  constraints  on  the  nature 
of this explanation: we construct explanations so that they can be inverted to generate
reformulations. An explanation for a reformulation has two parts: correctness and goodness
proofs.  A  correctness  proof  depends  only  on  our  theory  of  representation  and  it  guarantees 
that  the  new  formulation  preserves  a  given  class  of  queries  with  respect  to  the  old  one. 
A  goodness  proof  shows  that  the  new  formulation  has  better  computational  properties 
than  the  old  one:  it  requires  taking  the  problem  solver  and  our  theory  of  problem  solving 
into  account.  Standard  proofs  of  correctness  and  goodness  are  non-generative.  We  use 
meta-theoretical justifications that tie the change in formulation directly to a change in
computational properties. One class of such explanations is that of irrelevance explanations. An
irrelevance  explanation  proves  that  certain  distinctions  in  the  formulation  are  not  logically 
necessary  for  the  solution  of  a  given  class  of  questions. 

The crux of this thesis is the transformation of these justifications into generative procedures
for choosing new terms in order to improve system performance. This is done by
meta-theoretic reduction inferences that modify the formulation so that the irrelevance
claims are no longer true of the new formulation. This is guided by a local optimisation
principle called the irrelevance principle, whose informal statement is: minimizing distinctions
with respect to a set of goals minimizes computational effort in the solution of these
goals.

The  minimization  of  distinctions  irrelevant  to  the  goal  requires  introducing  new  terms 
that  stand  for  macro-objects  in  the  formulation  space  and  macro-actions  in  the  search 
space.  The  reduction  inferences  restructure  the  computation  using  extra-logical  criteria 
(e.g.,  minimize  redundant  computation)  that  bring  the  properties  of  the  problem  solver 
to  bear,  and  a  new  formulation  is  obtained  by  regressing  the  restructured  computation 
through  an  axiomatized  description  of  the  problem  solver. 

1.6  Claims  of  the  Thesis 

The  general  insights  about  reformulation  and  the  process  of  automating  it  are 

1.  Reformulation  is  reconceptualization;  a  change  in  the  objects,  functions  and  relations 
assumed  by  a  formulation.  This  level  of  description  of  the  phenomenon  captures  an 
important  invariant  in  the  shift. 

2. Abstraction reformulations can be automatically generated by the irrelevance principle,
which advises discarding distinctions irrelevant to the goals at hand.

3. The irrelevance principle has a computational justification: it leads to the minimization
of unnecessary computation.

4.  Meta-theoretic  claims  of  irrelevance  about  a  formulation  are  key  to  the  automatic 
generation  of  abstraction  reformulations. 

5. Meta-theoretic reduction inferences automatically eliminate irrelevance in a formulation.

The  specific  contributions  of  the  thesis  are 

1.  The  development  of  the  calculus  of  irrelevance  that  allows  stating,  verifying  and 
generating  justifications  for  abstraction  reformulations. 

2. The formulation of the irrelevance principle and a logical analysis of the meta-theoretic
reduction of a formulation by irrelevance claims.

3. The design of algorithms for reduction that are graph-theoretic compilations of the
reduction process.

4. The design of abstract representations for representations, called definability structures
and the definedness graph, and efficient algorithms that operate on them.

5.  Methods  for  the  complete  automation  of  the  class  of  abstractions  called  elimination 
of  intermediates. 

1.7 Perspectives

1. Knowledge Representation: This thesis provides a semantic account of efficient language
by describing reformulations for computational efficiency at the level of conceptualizations.
The method of irrelevance minimization gives a generative account
of how computational pressures shape representations. A new structure for describing
representations, called definability lattices, is also introduced.

2. Problem Solving: We introduce a new kind of meta-theoretic reasoning called irrelevance
reasoning that speeds up computation of given queries in a logical formulation
of a problem. It makes use of information about the queries, and the problem solver
that works on the queries, to determine what aspects of the formulation can be
abstracted to make the computation efficient.

3. Machine Learning: The origin of new terms for computational efficiency, or
skill-acquisition style of learning, is one of the open problems in machine learning. This
thesis proposes an analytical solution to the problem that uses knowledge about the
problem and the problem solver, as well as the principle of irrelevance, to automatically
acquire new vocabulary to improve performance.

4.  Knowledge  Engineering:  We  introduce  the  notion  of  irrelevance  that  is  essential  for 
knowledge  base  designers.  Many  of  the  justifications  for  choice  of  conceptualization 
can  be  formulated  in  terms  of  it.  By  explicitly  recording  these  justifications,  the 
designer  can  give  the  system  the  ability  to  modify  its  conceptualization  automatically 
in  the  face  of  changing  environments. 

1.8  Reading  Guide 

This  dissertation  presents  the  thesis  that  representations  are  not  arbitrarily  chosen,  rather 
they  are  the  result  of  principles  of  computational  economy.  The  document  is  organized  as 
follows.  Chapter  2  analyzes  the  reformulation  problem  and  proposes  a  methodology  for 
automating  it.  It  also  presents  the  irrelevance  principle  that  underlies  the  generation  of 
abstraction  reformulations.  The  theoretical  apparatus  necessary  for  applying  this  principle 
is  developed  in  Chapter  3.  Chapter  4  shows  how  this  principle  can  explain  the  formation 
of abstraction reformulations. While Chapters 3 and 4 provide an epistemologically adequate
solution to automating reformulation, Chapter 5 addresses the heuristic adequacy
of our solution. It describes special cases of the theory that are mechanizable and that
cover  a  large  percentage  of  abstraction  reformulations.  It  also  presents  some  results  of 
empirical  tests  that  confirm  our  theory.  Chapter  6  restates  the  main  results  of  the  thesis 
and  contains  a  discussion  of  their  significance  in  designing  representations.  We  conclude 
with  an  evaluation  of  the  strengths  and  weaknesses  of  our  approach  and  a  proposal  for 
future  research  on  other  kinds  of  reformulation. 

There  are  some  notational  conventions  we  use  in  this  thesis.  We  will  distinguish  things 
from  symbols  that  represent  them.  Thus,  elements  of  encodings  will  be  printed  in  sans- 
serif  while  the  elements  of  conceptualizations  will  be  in  bold  face.  As  an  example,  the 
Ancestor  relation  will  be  represented  by  the  relation  symbol  Ancestor.  For  propositional 
letters,  we  will  use  lower-case  letters.  For  first-order  formulas,  we  will  distinguish  between 
constants and variables. Constants will be in lower-case, variables in CAPITAL letters.
The standard connectives ($\lor$, $\land$, $\equiv$) and the quantifiers ($\forall$, $\exists$) will be used.


Chapter  2 


The  Reformulation  Problem 


What  can  we  do  when  we  can’t  solve  a  problem?  We  can  try  to  find  a  new 
way  to  look  at  it,  to  describe  it  in  different  terms.  Reformulation  is  the  most 
powerful  way  to  attempt  to  escape  from  what  seems  to  be  a  hopeless  situation. 

—  Marvin  Minsky,  Society  of  Mind,  page  141. 

2.1  Introduction 

Scientific  advances  involve  not  only  solving  problems  but  posing  them  as  well.  Asking  the 
right  questions  is  a  creative  act  and  solving  them  is  a  relatively  routine  activity,  once  the 
questions  are  correctly  identified.  This  chapter  attempts  to  describe  the  reformulation 
phenomenon  as  precisely  as  possible  and  poses  the  problem  of  automating  it.  Previous 
approaches to the problem are critically examined and the solution methodology proposed
in  this  thesis  is  presented. 

A cognitive phenomenon like reformulation can be explained at three different levels
[Pyl84]: the intensional, semantic or knowledge level; the symbolic, syntactic or functional
level; and the physical or biological level. An explanation of a phenomenon at the
intensional level appeals to the semantic content of representations in an agent. The
regularities of the phenomenon are captured by principles that mention the agent’s goals and
beliefs. In this thesis, we give a semantic account of reformulation as changing the
distinctions an agent makes in the world, in order to achieve its goals effectively. The symbolic
account explains how these goals and beliefs are encoded and presents the algorithms by
which the behaviour is achieved. Reformulation is realized by changing encodings: the
symbolic account of reformulation is thus a syntactic one of theory change. The physical
account explains how the symbol systems are actually constructed. A physical account of
reformulation is not provided in this thesis, although it would be useful for building a
reformulation device.

2.2  The  Semantics  of  Reformulation:  Conceptual  Change 

Reformulation  is  reconceptualization.  When  one  conceptualizes  a  problem,  one  determines 
what objects, functions and relations are needed for the purpose. The items in a conceptualization
represent the distinctions for stating and solving the problem. Reformulation
changes  distinctions:  it  restructures  our  knowledge  of  the  world  in  terms  of  new  conceptual 
elements. 

Traditional accounts of reformulation [Kor80, Mos81, Mar76a, New65, Len82] have only
provided syntactic methods without the accompanying semantics. This thesis gives meaning
to shifts in formulation by examining the shifts in conceptualization they entail. By
equating reformulation with reconceptualization we provide a clean Type-1 [Mar76b] theory
of reformulation; i.e., we separate an account of what reformulation is from how to
do it. Before we describe what reconceptualizations are, we begin with some intuitions
about, and a formalization of, conceptualizations.

2.2.1  Conceptualizations 

A conceptualization [GN87] is a model of the world: it consists of the objects, functions and
relations  that  are  of  interest.  For  example,  we  can  conceptualize  kinship  among  a  given 
set  of  individuals  as  in  Figure  1.1.  Another  conceptualization  of  the  kinship  problem  that 
preserves  the  SameFamily  relation  is  shown  in  Figure  1.2.  In  the  new  conceptualization, 
the  distinction  between  Father  and  Ancestor  is  removed  and  replaced  by  the  maximal 
ancestor  or  the  FoundingFather  relation.  A  full  adder  can  be  conceptualized  either 
at  the  gate  level  as  in  Figure  2.1  or  as  a  unit,  as  in  Figure  2.2.  More  examples  of 
conceptualizations  are  found  in  [GN87].  Note  that  the  items  in  bold  face  are  relations 
in  the  world.  This  is  our  notational  convention  for  the  meta-language  whose  domain  of 
discourse  is  the  elements  in  a  conceptualization. 

We  repeat  the  definition  of  a  conceptualization  from  Chapter  1  for  convenience. 


Definition 7 A conceptualization is a triple $(\mathcal{O}, \mathcal{F}, \mathcal{R})$ where $\mathcal{O}$ is a set of objects called
the universe of discourse; $\mathcal{F}$, called the functional basis set, is a subset of functions from
$\mathcal{O}^n$ to $\mathcal{O}$, and $\mathcal{R}$, called the relational basis set, is the subset of relations on $\mathcal{O}^m$, for n, m
in the set of natural numbers.

Figure 2.1: A conceptualization of a Full Adder
Objects: the gates X1, X2, A1, A2, O1; the ports a, b, c, d, e, f, sum, carry; the values 0, 1
Functions and relations: conn, and, or, xor, value

Figure 2.2: Another conceptualization of a Full Adder
Objects: the ports In1, In2, Cin, Cout, Sum
Functions and relations: and, or, xor, value

Figure 2.3: Defining Clear in terms of On (blocks A, B, C, E and F)

A conceptualization is a structure very similar to a Herbrand model. The only difference is
that a Herbrand model consists of all the functions and relations defined on the Herbrand
universe ($\mathcal{O}$), whereas we want to select particular functions and relations to be members
of a conceptualization.

There is an interesting relationship between the elements in the two conceptualizations of
the kinship problem. We can construct the relation FoundingFather out of the Ancestor
relation by using standard relational operations [Ull82]. However, we cannot reconstruct
the Ancestor relation out of the FoundingFather relation. The constructibility
of a conceptual element from a set of such elements can be made precise using the logical
notion of definability [End66].

Definition 8 A conceptual element c is definable in terms of a set C of objects, functions
and relations $x_1, x_2, \ldots, x_n$, written as Definable(c, C), if there exists a first-order formula
$\phi$ with non-logical symbols $\gamma, \sigma_1, \sigma_2, \ldots, \sigma_n$, for which 1) there is a model of $\phi$ that maps
the $\sigma_i$'s to the $x_i$'s and $\gamma$ to c; 2) every model of $\phi$ that maps the $\sigma_i$'s to the $x_i$'s also maps $\gamma$ to c.
$\phi$ is called the defining formula for c in $\{x_1, x_2, \ldots, x_n\}$.

Notice  that  Definition  8  expresses  the  construction  of  c  in  terms  of  the  elements  of 
C  intensionally  as  a  formula  in  a  language.  Here  are  a  few  examples  of  the  use  of  this 
definition.

Figure 2.3 shows a conceptualization of the blocks world. It consists of the blocks
shown and two relations: Clear, whose extension is the set of blocks that have no blocks
on top of them, and On, which consists of block pairs (x, y) where x is on top
of y. On = {(A,B), (B,C), (E,F)} and Clear = {A,E}. The unary relation Clear is
definable in terms of the binary On relation. The defining formula is

$\mathit{Clear}(x) \equiv \neg\exists y.\ \mathit{On}(y, x)$

The symbol On is interpreted to be the On relation. Every model of $\phi$ that maps On to
On will have to map Clear to Clear. Note, however, that On cannot be defined in terms of
Clear. This is because for a fixed model for Clear, the sentence $\neg\exists y.\ \mathit{On}(y, x)$ constrains
the set of possible models for On, but does not uniquely determine it.
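The asymmetry between Clear and On can be checked on the finite extensions given in the text. The following is a minimal sketch, not part of the original formalism: the function name clear_from_on and the alternative stacking ON_ALT are assumptions introduced only for illustration.

```python
# Blocks-world extensions taken from the text (Figure 2.3).
BLOCKS = {"A", "B", "C", "E", "F"}
ON = {("A", "B"), ("B", "C"), ("E", "F")}


def clear_from_on(blocks, on):
    """Clear(x) holds exactly when no block y satisfies On(y, x)."""
    covered = {below for (_above, below) in on}
    return {x for x in blocks if x not in covered}


CLEAR = clear_from_on(BLOCKS, ON)

# Hypothetical alternative stacking (not in the text): A on C, C on B, E on F.
# It yields the same Clear set, so Clear alone cannot determine On.
ON_ALT = {("A", "C"), ("C", "B"), ("E", "F")}
CLEAR_ALT = clear_from_on(BLOCKS, ON_ALT)
```

Both stackings produce the same extension for Clear, which is the sense in which Clear constrains but does not fix the On relation.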

In  our  kinship  example,  the  FoundingFather  relation  is  definable  in  terms  of  Ancestor, 
because  we  can  construct  a  definition 

$\forall x y.\ \mathit{FoundingFather}(x, y) \equiv \mathit{Ancestor}(x, y) \land \neg\exists z.\ \mathit{Ancestor}(z, x)$

In all models in which Ancestor refers to the Ancestor relation, the symbol FoundingFather
is mapped to the FoundingFather relation.
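As an illustration of this defining formula, the sketch below computes a FoundingFather extension from an Ancestor extension. The family members and both Ancestor relations are hypothetical examples, not data from the thesis.

```python
def founding_father(ancestor):
    """Pairs (x, y) with Ancestor(x, y) where x has no ancestor of its own."""
    has_ancestor = {y for (_x, y) in ancestor}
    return {(x, y) for (x, y) in ancestor if x not in has_ancestor}


# Hypothetical family: abe is the father of bob, bob of cal, dan of eli.
ANCESTOR = {("abe", "bob"), ("abe", "cal"), ("bob", "cal"), ("dan", "eli")}
FOUNDING_FATHER = founding_father(ANCESTOR)

# A different hypothetical family in which bob and cal are both children of abe.
ANCESTOR_ALT = {("abe", "bob"), ("abe", "cal"), ("dan", "eli")}
FOUNDING_FATHER_ALT = founding_father(ANCESTOR_ALT)
```

The two Ancestor relations differ but yield the same FoundingFather extension, which matches the earlier remark that Ancestor cannot be reconstructed from FoundingFather.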

Both  the  examples  above  are  instances  of  defining  a  new  relation  in  terms  of  other  given 
relations.  We  now  address  the  issue  of  defining  new  objects.  One  way  to  define  a  new 
object  is  to  reify  an  existing  relation.  For  instance,  we  can  define  the  object  red  to  stand 
for  the  predicate  Red  by  introducing  a  new  function  from  predicates  to  objects  called  the 
denotation function [McC79] and write

denotes(red, Red).

This would constitute the defining formula for the object red in terms of the relation Red.

Definition 8 uses first-order definability, because the formula $\phi$ is a first-order
well-formed formula. In order to define the SetofMissionaries relation from a conceptualization
that contains individual missionaries in the Missionaries relation, we need to be
able to define arbitrary subsets of Missionaries. The defining formula is then

$\mathit{SetofMissionaries} = \{x \mid \mathit{Missionary}(x)\}$.

To see why this is not a pure first-order definition in the sense required by Definition 8,
notice that the above definition can be rewritten as

$\mathit{member}(x, \mathit{SetofMissionaries}) \equiv \mathit{Missionary}(x)$

There  is  no  set  of  first-order  axioms  about  member  that  makes  standard  set-theoretic 
reasoning  reproducible  inside  first-order  logic.  We  need  to  enrich  our  notion  of  a  defining 
formula  in  Definition  8  to  allow  set-theoretic  constructions.  Then  we  can  construct  a 
richer  class  of  conceptual  elements  from  a  given  set. 

Here’s  another  example  of  a  concept  that  is  undefinable  in  a  pure  first-order  system. 
The  relation  Ancestor  is  the  transitive  closure  of  the  Father  relation.  Unfortunately, 
there  is  no  first-order  well-formed  formula  that  defines  the  Ancestor  relation  in  terms  of 
the  Father  relation.  The  following 

$\forall x y.\ \mathit{Ancestor}(x, y) \equiv \mathit{Father}(x, y) \lor (\exists z.\ \mathit{Ancestor}(x, z) \land \mathit{Ancestor}(z, y))$

constrains the Ancestor relation to be at least the transitive closure of Father, but does
not  fix  it  to  be  just  the  transitive  closure.  If  this  definition  were  interpreted  within  the 
semantics  of  Prolog  (minimal  Herbrand  models),  then  the  above  sentence  would  constitute 
a  valid  definition  for  Ancestor  under  the  conditions  of  Definition  8. 
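The gap between "at least the transitive closure" and "exactly the transitive closure" can be seen on a small finite universe. The sketch below is an illustration under assumed data: the three-element universe and the Father relation are hypothetical. It computes the least relation satisfying the sentence (the minimal-model reading) and checks that a strictly larger relation satisfies the same sentence.

```python
def least_ancestor(father):
    """Least fixpoint of Ancestor = Father ∪ (Ancestor composed with Ancestor)."""
    ancestor = set(father)
    while True:
        step = {(x, y) for (x, z) in ancestor for (w, y) in ancestor if z == w}
        if step <= ancestor:
            return ancestor
        ancestor |= step


def satisfies_sentence(father, ancestor, universe):
    """Check the biconditional for every pair (x, y) over a finite universe."""
    for x in universe:
        for y in universe:
            rhs = (x, y) in father or any(
                (x, z) in ancestor and (z, y) in ancestor for z in universe
            )
            if ((x, y) in ancestor) != rhs:
                return False
    return True


# Hypothetical data: a is the father of b, and b is the father of c.
UNIVERSE = {"a", "b", "c"}
FATHER = {("a", "b"), ("b", "c")}

LEAST = least_ancestor(FATHER)                        # the transitive closure
FULL = {(x, y) for x in UNIVERSE for y in UNIVERSE}   # a non-minimal model
```

Both LEAST and FULL satisfy the sentence for this Father relation, so the sentence alone does not single out the transitive closure; restricting attention to the least model does.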

We  can  extend  Definition  8  to  cover  the  definability  of  an  entire  conceptualization  in  terms 
of  another. 

Definition 9 A conceptualization $C_2$ is definable in terms of a conceptualization $C_1$, written
as Definable-C($C_2$, $C_1$), if $\forall c \in C_2.\ \mathit{Definable}(c, C_1)$. The set of defining formulas
constitutes the articulation theory between the two conceptualizations.

The  conceptualization  in  Figure  1.2  is  definable  from  that  in  Figure  1.1.  The  articulation 
theory  is  the  definition  of  the  FoundingFather  relation  given  before. 

One of the problems with Definition 8 and thus with Definition 9 (since it hinges on
Definition 8) is that it imposes almost no constraints on the nature of the defining formula
$\phi$. For finite conceptualizations, and a conceptual element with a finite extension,
$\phi$ could simply be the trivial disjunction of the elements of the extension of that
conceptual element. In the case of the kinship problem, we could as well have defined the
FoundingFather relation directly as its extension over a certain universe of people. The
problem with such a $\phi$ is that it needs to change when the universe of discourse changes.
Notice that the universally quantified defining formula provided before for the FoundingFather
relation is independent of the particulars of the universe of discourse of the
conceptualization.

To get around this problem, we define the notion of a conceptual scheme that abstracts
away particular individuals in a conceptualization, and preserves the functional and
relational structure. We can no longer describe such a conceptualization as in Figure 1.1,
so we use an intensional description via definability claims. For the conceptualization in
Figure 1.1, we have the following scheme:

Definable(Ancestor, {Father, Ancestor})

Definable(SameFamily, {Ancestor})

Definition 10 A conceptual scheme CS is a pair $(S, \mathcal{D})$ where S is a set of relations and
functions, and $\mathcal{D}$ is a set of definability claims of the form Definable(a, b) where $a \in S$ and
$b \subseteq S$.
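One possible way to hold a conceptual scheme as data is sketched below. The class name ConceptualScheme and its field names are assumptions made for illustration; the only thing taken from Definition 10 is the well-formedness condition that each claim Definable(a, b) has a in S and b a subset of S.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualScheme:
    symbols: frozenset  # S: names of relations and functions
    claims: frozenset   # D: pairs (a, b), each read as Definable(a, b)

    def __post_init__(self):
        # Enforce the condition of Definition 10 on every claim.
        for a, b in self.claims:
            if a not in self.symbols or not b <= self.symbols:
                raise ValueError(f"ill-formed claim Definable({a}, {set(b)})")


# The kinship scheme given in the text for the conceptualization of Figure 1.1.
KINSHIP = ConceptualScheme(
    symbols=frozenset({"Father", "Ancestor", "SameFamily"}),
    claims=frozenset({
        ("Ancestor", frozenset({"Father", "Ancestor"})),
        ("SameFamily", frozenset({"Ancestor"})),
    }),
)
```

The scheme records only which elements are definable from which others; the defining formulas themselves are deliberately absent, which is why such a scheme need not pick out a unique conceptualization.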

Since the identity of the defining formula $\phi$ is abstracted away in the relation Definable,
the set of definability claims that constitute a conceptual scheme rarely picks out
a unique conceptualization (i.e., the claims are not categorical, except in trivial cases). If, on
the other hand, we maintain $\phi$ along with the Definable relation, we essentially keep a
recipe for constructing parts of a conceptualization from other parts. For instance, Father
is a base relation in the conceptualization $C_1$. Given Father and the fact that
Definable-$\phi$(Father, {Ancestor}, $\mathit{Father}(x, y) \lor \exists z.\ \mathit{Ancestor}(x, z) \land \mathit{Ancestor}(z, y)$), we can
construct the extension of the Ancestor relation. Definable-$\phi$ is important because it
succinctly captures the construction of a conceptualization from its base objects, functions
and relations. Definable-C, on the other hand, compactly describes the construction of one
conceptualization from another. We define Definable-C-$\phi$ as the extension of Definable-C
that maintains the articulation theory between the two conceptualizations.

2.2.2  Definability  Analysis 

Since definability is a key notion in our analysis of conceptualizations and reconceptualizations,
we study its properties in detail.

Observation 1 Definable-C is a reflexive relation.

This follows directly from the definition of Definable-C. Every conceptualization can be
constructed from itself by the identity map.

Observation  2  Definable-C  is  a  transitive  relation. 

If we can construct $C_2$ from $C_1$ and the articulation theory is $A_1$, and if $C_3$ can be
constructed from $C_2$ using the set of definitions $A_2$, then as long as the names of the elements
in the three conceptualizations are standardized apart, we can construct $C_3$ from $C_1$ by
composing the articulation theories in sequence.
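The composition argument can be sketched by treating an articulation theory as a mapping from new element names to functions that compute an extension from the source conceptualization. This encoding, the helper names apply_theory and compose, and the Founder relation are all assumptions made for illustration; the thesis states articulation theories as sets of defining formulas.

```python
def apply_theory(theory, conceptualization):
    """Build a new conceptualization by evaluating each defining function."""
    return {name: define(conceptualization) for name, define in theory.items()}


def compose(first, second):
    """Articulation theory that goes from C1 directly to C3 (first, then second)."""
    def lift(define):
        return lambda c1: define(apply_theory(first, c1))
    return {name: lift(define) for name, define in second.items()}


def founding_father(c):
    has_ancestor = {y for (_x, y) in c["Ancestor"]}
    return {(x, y) for (x, y) in c["Ancestor"] if x not in has_ancestor}


# Hypothetical C1, with element names standardized apart across C1, C2 and C3.
C1 = {"Ancestor": {("abe", "bob"), ("abe", "cal"), ("bob", "cal"), ("dan", "eli")}}
A1 = {"FoundingFather": founding_father}
A2 = {"Founder": lambda c: {x for (x, _y) in c["FoundingFather"]}}

C3_DIRECT = apply_theory(compose(A1, A2), C1)
C3_STEPWISE = apply_theory(A2, apply_theory(A1, C1))
```

Applying the composed theory to $C_1$ gives the same result as applying $A_1$ and then $A_2$, which is the content of the transitivity observation for this toy case.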

Observation  3  Definable-C  is  not  symmetric. 

If $C_2$ is created from $C_1$ by losing information, then it is impossible to recover that
information. This is the case in the abstraction of the conceptual primitive FoundingFather
from the relation Ancestor. Unfortunately, Definable-C is not anti-symmetric either! If
$C_1$ is definable in terms of $C_2$, and $C_2$ is also definable in terms of $C_1$, then $C_1$ and $C_2$
are only isomorphic and not identical. An example is the $(r, \theta)$ conceptualization and the
$(x, y)$ conceptualization of the real plane: each conceptualization can be constructed from
the other, but they are not identical. Observation 3 prevents Definable-C from defining a
lattice structure on conceptualizations.
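The polar and rectangular example can be made concrete with the usual coordinate conversions. This is only a sketch of mutual constructibility for a single point; the function names are assumptions, and the sample point is arbitrary.

```python
import math


def to_polar(x, y):
    """Construct (r, theta) from (x, y)."""
    return (math.hypot(x, y), math.atan2(y, x))


def to_rect(r, theta):
    """Construct (x, y) from (r, theta)."""
    return (r * math.cos(theta), r * math.sin(theta))


POINT = (3.0, 4.0)
POLAR = to_polar(*POINT)
ROUND_TRIP = to_rect(*POLAR)
```

Each description can be rebuilt from the other (up to floating-point error), yet the two descriptions of the same point are different objects, which is why mutual definability yields isomorphism rather than identity.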

Theorem  4  Definable-C  is  a  pre-order. 

Theorem  4  makes  it  easy  for  us  to  generate  the  search  space  of  possible  primitives  for 
a  conceptualization.  Figure  2.2.2  introduces  the  definability  structure  which  is  a  set  of 
conceptualizations ordered by the Definable-C relation. From a given conceptualization C
in a definability structure S, we can construct the upper and lower sets of C, both of which
are subsets of S.

$\mathit{Upper}(C, S) = \{c \mid \text{Definable-C}(C, c) \land c \in S\}$

$\mathit{Lower}(C, S) = \{c \mid \text{Definable-C}(c, C) \land c \in S\}$

The set Upper consists of conceptualizations that make finer distinctions than C and
can thus be thought of as refinements of C. The set Lower contains conceptualizations
that make coarser distinctions than C, alternatively construed as the abstractions of C.
Both Upper and Lower are partially ordered by the relation Definable-C. Note that

$\mathit{Upper}(C, S) \cap \mathit{Lower}(C, S) = \{C\} \cup \{c \mid \text{Definable-C}(c, C) \land \text{Definable-C}(C, c) \land c \in S\}$

We can define upper and lower bounds of two conceptualizations in S because of the
existence of the pre-order Definable-C. A conceptualization $C_l$ is a lower bound of the
conceptualizations $C_1$ and $C_2$ if it is definable from both of them.

$\mathit{Lower\_Bound}(C_1, C_2) = C_l$ such that $\text{Definable-C}(C_l, C_1) \land \text{Definable-C}(C_l, C_2)$

Similarly,  an  upper  bound  of  two  elements  in  the  lattice  is  a  conceptualization  that  defines 
them  both. 

$\mathit{Upper\_Bound}(C_1, C_2) = C_u$ such that $\text{Definable-C}(C_1, C_u) \land \text{Definable-C}(C_2, C_u)$

We can define greatest lower bounds and least upper bounds in the usual way. Informally,
a lower bound of two conceptualizations includes at most the distinctions made
in the intersection of the conceptualizations. An upper bound includes at least the distinctions
made in the union of the conceptualizations. A greatest lower bound of two
conceptualizations  is  an  interesting  structure,  because  it  is  one  that  preserves  just  the 
distinctions  common  to  both  conceptualizations.  So  the  search  for  a  minimal  weakening 
of  two  conceptualizations  that  preserves  some  relations  of  interest,  is  simply  the  search  for 
the  greatest  lower  bound. 
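The upper and lower sets and the bound definitions above can be sketched on a small pre-order. The node names T, P, Q, G and E and the generating pairs are hypothetical; a pair (a, b) is read as Definable-C(a, b), that is, a is definable from b.

```python
from itertools import product


def preorder(nodes, generators):
    """Reflexive-transitive closure of generator pairs (a, b): a definable from b."""
    rel = {(n, n) for n in nodes} | set(generators)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(rel), repeat=2):
            if b == c and (a, d) not in rel:
                rel.add((a, d))
                changed = True
    return rel


def upper(c, nodes, rel):
    """Conceptualizations from which c is definable (refinements of c)."""
    return {n for n in nodes if (c, n) in rel}


def lower(c, nodes, rel):
    """Conceptualizations definable from c (abstractions of c)."""
    return {n for n in nodes if (n, c) in rel}


def lower_bounds(c1, c2, nodes, rel):
    return lower(c1, nodes, rel) & lower(c2, nodes, rel)


def greatest_lower_bounds(c1, c2, nodes, rel):
    lbs = lower_bounds(c1, c2, nodes, rel)
    return {g for g in lbs if all((l, g) in rel for l in lbs)}


# Hypothetical structure: P and Q are definable from T, G from both P and Q, E from G.
NODES = {"T", "P", "Q", "G", "E"}
REL = preorder(NODES, {("P", "T"), ("Q", "T"), ("G", "P"), ("G", "Q"), ("E", "G")})
```

In this toy structure the lower bounds of P and Q are G and E, and G is the greatest of them because E is definable from G but not the reverse.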

The definability structure is bounded because we only consider finite conceptualizations;
the topmost node of the lattice represents the finest grain of distinctions we can ever
make in the world, and the lowest node is the empty conceptualization: it is definable in
terms of every other node in the structure. We now present an example of how portions
of this structure can be generated from a given conceptual scheme. We start with a
schema consisting of the goal relation SameFamily, and we incrementally introduce finer
distinctions. The generating formula for the first layer of nodes “above” this consists
of two relation-schemas: x and SameFamily where Definable(x, {SameFamily}). The
relations that satisfy this constraint are what we already know as Father, Ancestor,
and FoundingFather. These x’s are called interpolants of SameFamily. We can build
further nodes by constructing interpolants of the newly introduced relations. This is shown
in Figure 2.4.

2.2.3  Reconceptualizations 

Definability is an important tool for analyzing the relationship between different
conceptualizations because it succinctly describes how one conceptualization can be constructed
from another. We can now define what we mean by a conceptual shift.^1

^1 This is a restatement of Definition 3 of Chapter 1.

Figure 2.5: Conceptualizations and the World


Definition 11 A conceptualization $C_2$ is a reconceptualization of $C_1$ with respect to some
background conceptualization A, if the elements of $C_2$ are definable from $C_1$ and A.

Reformulation is the science of introducing and removing distinctions. It is the ability
to cut the world up into the right pieces for the current goals. The items in a conceptualization
denote distinctions in the world. The reconceptualization is also about this
world. How do we capture this notion? The denotation relation between the world and
the conceptualization, as shown in Figure 2.5, is in the head of the modeler. To get at the
fact that both conceptualizations are about the same world, we rely on the integrity of
the denotation relation between $C_1$ and the world, and then construct $C_2$ out of $C_1$. This
is why definability is an important constraint in our definition of reconceptualization. If
A is the null set, then the new conceptualization is guaranteed to be about the same
world^2, since definable relations and objects can only make existing distinctions coarser
and cannot introduce new ones. Definability captures the notion of retiling a given W or
an abstraction of it. If we wish to introduce distinctions, we enrich W explicitly through
the background knowledge A. For instance, the introduction of the sets of missionaries
and cannibals is done by the introduction of set theory through A.

^2 or as much about the given world as the original conceptualization was.

We notice that according to Definition 5, the conceptualization $C_2$ in Figure 1.2 is a
reconceptualization of $C_1$ in Figure 1.1, with A = $\emptyset$. This reconceptualization did not
change the objects in the domain of discourse, only the relations on them. It is called an
iso-ontic reconceptualization. Since it is the case that

$\neg\,\mathit{Definable}(\mathit{Ancestor}, \{\mathit{FoundingFather}\})$,

the reconceptualization has lost information; we call it an abstraction. An example of
a reconceptualization that involves changing the objects is the classic missionaries and
cannibals puzzle, first analyzed in [Ama68]. The initial conceptualization of the problem
consists of six individuals: Larry, Curly and Moe (the missionaries), and Huey, Dewey
and Louie (the cannibals). The reconceptualized problem introduces two new objects:
the set of missionaries and the set of cannibals. The introduction of the sets allows us to
change the granularity of the individual-based boat load and unload actions into set-based
load and unload actions. This then allows for efficient computation of the schedule for
transporting people across the river.

Let us study some more cases of conceptual change and see whether they fit our definition.
The change from the geocentric into the heliocentric conceptualization of planetary
motion was an iso-ontic reconceptualization^3 that changed the relations among the members
of our solar system. Central to the geocentric theory was the relation Orbits-Earth,
whose extension included the sun and all the planets except the earth. The heliocentric
theory grouped the objects somewhat differently: it posited the existence of the Orbits-Sun
relation whose extension included all the planets. Note that being a finite relation,
Orbits-Sun is definable in terms of Orbits-Earth and equality.

$\forall x.\ \mathit{Orbits\text{-}Sun}(x) \equiv [x \neq \mathit{sun} \land \mathit{Orbits\text{-}Earth}(x)] \lor x = \mathit{earth}$
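The defining formula can be evaluated directly over a finite universe. In the sketch below the list of bodies is an assumed, illustrative universe; the extension of Orbits-Earth follows the description in the text (the sun and all the planets except the earth).

```python
# Assumed finite universe of bodies (illustrative only).
BODIES = {"sun", "mercury", "venus", "earth", "mars", "jupiter", "saturn"}

# Geocentric primitive, as described in the text.
ORBITS_EARTH = BODIES - {"earth"}


def orbits_sun(bodies, orbits_earth):
    """Orbits-Sun(x) holds iff (x != sun and Orbits-Earth(x)) or x == earth."""
    return {
        x for x in bodies
        if (x != "sun" and x in orbits_earth) or x == "earth"
    }


ORBITS_SUN = orbits_sun(BODIES, ORBITS_EARTH)
```

The computed extension contains every body in the universe except the sun, so the new primitive is obtained from the old one and equality without adding any objects.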

The same phenomena (the astronomical data) were now explained in a simpler way in the
heliocentric theory with the new conceptual primitive Orbits-Sun. Conceptual change
(change in the primitives to describe the world) and theory revision (change in what
we say about the world) often go hand-in-hand and it is difficult to separate the two
phenomena. There are at least three ways in which different conceptualizations of a given
world differ [Car87].

1.  Individual  concepts  in  the  system 

A pure case of conceptual change occurs in the Fourier and Laplace transformations
as well as rectangular to polar coordinate transforms. Reformulating a description
of a 2-D scene expressed in rectangular coordinates into polar coordinates preserves
the content of the initial description exactly. The distinctions made by the two
conceptualizations are different: they tile the same content space in different ways.

^3 We shall treat epicycles as relations and not objects, for this purpose.

2.  The  domain  of  the  phenomena  accounted  for 

An  example  from  physics  is  the  wave  and  particle  conceptualizations  of  light.  Each 
conceptualization  accounted  for  phenomena  the  other  couldn’t.  A  more  mundane 
case  is  the  difference  in  the  conceptualization  of  a  leaf  by  a  botanist  and  by  a  layman. 
The  botanist  makes  many  more  distinctions  than  the  layman  because  he  needs  finer 
distinctions  to  be  able  to  make  predictions  of  interest  to  him.  Yet  another  case  from 
the  history  of  science  is  the  shift  from  the  Aristotelian  to  the  Galilean  theory  of 
motion: Aristotle’s theory included all changes over time (growth, decay, movement,
etc.), whereas Galileo specialized it to cover movement alone [Kuh87].

3.  The  nature  of  the  explanation 

The shift to the heliocentric theory made the explanations of astronomical observations
simpler. This is a case where the nature of explanations changed as a result of
the conceptual shift. In our kinship example, an encoding of the reconceptualization
in terms of FoundingFather makes proofs of SameFamily propositions shorter. The
novice-expert conceptual shifts studied in detail by cognitive scientists [CME82] indicate
that experts use relations among objects that are definable in terms of those
held by novices. However, possession of these concepts allows them to interpret a
problem situation better and state strategies in a much more perspicuous fashion
than is allowed by the novice’s ontology.

It should be emphasized that identifying the ontological commitments of a theory
(especially a scientific one) is a non-trivial endeavour. Our definition of a conceptual change
allows us to compare two given ontologies and assess whether they are about the same reality
by determining their interdefinability.

We  shall  call  a  reconceptualization  correct  with  respect  to  a  set  of  goal  relations,  if  the 
goals  are  preserved  in  the  new  conceptualization.  More  formally, 

Definition 12 $C_2$ is a correct reconceptualization of $C_1$ with respect to the set of goal
relations $G$ if Definable($G$, $C_1$) if and only if Definable($G$, $C_2$).

The  kinship  reformulation  is  correct  with  respect  to  the  goal  relation  SameFamily 
and  incorrect  with  respect  to  the  goal  relation  Father. 
The search for a correct reconceptualization in the definability structure S is the search
from the given C to one where the goal G is still definable. A principle of economy that
dictates the making of the fewest distinctions would allow us to pick the lowest node in the
definability graph that has G definable in it. The role of the background conceptualization
A is to help navigate in the space of alternative conceptualizations S, either by directed
addition of distinctions in traversing Upper(C,S), or by the directed losing of distinctions
while traversing Lower(C,S).

We can distinguish several types of reconceptualizations:

1.  New  conceptualization  definable  in  terms  of  the  old  one 

This  class  includes  abstraction  reformulations  like  the  kinship  example  and  the  full 
adder  example.  There  are  two  basic  types  of  operations  needed  to  generate  them 
from  a  given  conceptualization:  dropping  conceptual  elements  (e.g.,  the  removal  of 
the  relation  Father),  and  adding  definable  compounds  (e.g.,  the  addition  of  the 
relation  FoundingFather).  This  class  of  reconceptualizations  does  not  permit  the 
solution  of  goals  that  were  unsolvable  in  the  old  conceptualization.  Often  they  set 
the  stage  for  encoding  shifts  that  permit  faster  solution  of  previously  soluble  goals. 

2.  New  conceptualization  consistent  with  the  old  one 

The  operation  that  generates  conceptualizations  in  this  class  is  the  addition  of  new 
conceptual  elements  that  are  not  definable  in  terms  of  the  existing  primitives.  An 
example  is  the  addition  of  epicycles  to  the  Ptolemaic  conception  of  planetary  motion. 
This  addition  usually  allows  for  the  solution  of  goals  that  couldn’t  be  solved  before. 
The  goals  that  could  be  solved  in  the  old  conceptualization  remain  solvable  and  yield 
the  same  answers  in  the  reconceptualization. 

3.  New  conceptualization  inconsistent  with  old  conceptualization 

An  example  is  the  shift  from  the  Ptolemaic  to  the  Galilean  conceptualization.  Some 
of  the  predictions  made  by  the  Ptolemaic  theory  were  no  longer  made  by  the  Galilean 
one. However, the two conceptualizations had some common elements: the observed
astronomical  data,  and  the  objects  in  the  solar  system. 

2.2.4  A  Knowledge  Level  Analysis  of  Reformulation 

Until now, we have discussed what a reformulation is, and how to generate the space
of possible reconceptualizations. We now turn our attention to the role reformulation
plays in an intelligent agent, so as to be able to navigate the definability structure in a
goal-directed manner. Why is reformulation an intelligent thing to do, and under what
conditions should an agent reformulate? To answer these questions, we perform a knowledge level
analysis of a reformulator. This in turn requires us to answer the following.

1.  What  goals  do  we  ascribe  to  a  reformulator? 

2.  What  other  beliefs  do  we  ascribe  to  it? 

3.  What  is  the  nature  of  the  background  knowledge  that  together  with  the  rationality 
principle  generates  the  appropriate  reformulation  behaviour? 

Note  that  a  reformulation  action  is  somewhat  different  from  a  traditional  action  like 
MoveBlock that changes the current state of the world. A reformulation action does not
change the world; it causes the agent to redescribe the world in its head. Because of
this  redescription,  an  agent’s  actions  in  the  world  might  be  affected.  An  agent  that  was 
unable  to  achieve  a  goal  in  a  previous  conceptualization,  might  be  able  to  achieve  it  in  a 
reformulated  version;  this  would  be  the  secondary  effect  of  a  reformulation  action.  Clearly, 
all  useful  reformulations  have  interesting  secondary  effects. 

The goal of a reformulator is to redescribe the world in terms of primitives (objects,
functions and relations) that would permit the effective solution of a specified class of
problem-solving goals. The beliefs that we attribute to the reformulator include:

1.  an  initial  conceptualization  of  the  world. 

2.  the  correctness  constraints  described  by  the  problem-solving  goals. 

3. an initial encoding of that conceptualization together with its computational
properties with respect to a given problem solver.

4.  the  effectiveness  constraints  imposed  by  the  environment. 

Whereas a conceptualization is fine-grained enough to capture correctness constraints,
it is too coarse to model computation. To describe computational constraints on the
solution of the goal, we need to introduce the concept of an encoding of a conceptualization.
An encoding is simply a set of sentences in an appropriate language, one of whose models
is the conceptualization. Details of the relationship between the encoding and the
conceptualization will be left till Section 3. For now, we will assume that there is a space of
encodings associated with a conceptualization and that effectiveness constraints are with
respect to computation of the goal relations in an encoding.

The goal of the agent is to solve its goals within the given correctness and effectiveness
constraints. The impetus to reformulate arises out of the fact that the agent cannot
meet these constraints, and thus decides to incrementally reconceptualize the world to
achieve them. The logical problem of reformulation is: what other general beliefs do we
assign to the reformulator that would allow it to deduce the new conceptualization?

To see what is needed, consider the deduction of the new primitive FoundingFather
from the conceptualization $C_1$ in Figure 1.1. The space of reconceptualizations for the goal
relation SameFamily is in Figure 2.4. We will assume that the time and space constraints
on the computation of the goal are met in conceptualizations in which the primitive
FoundingFather occurs4. The minimality principle of making just enough distinctions to
meet the correctness and goodness constraints dictates the choice of the conceptualization
{FoundingFather, SameFamily}.

Thus, the knowledge that we attribute to the reformulator includes: knowledge of
the space of reconceptualizations; knowledge to determine whether or not a particular
conceptualization and a particular encoding of it meet the correctness and effectiveness
constraints respectively; and knowledge of a principle of economy in the choice of
conceptualizations. This would logically entail a new choice of conceptual elements that
solves for the goal within the given computational constraints.

2.3  A  Syntactic  Account  of  Reformulation:  Theory  Change 

A  conceptualization  is  an  extensional  description  of  the  phenomenon  of  interest.  It  is 
typically  in  the  head  of  the  programmer;  she  communicates  it  to  the  agent  by  writing 
sentences  in  a  language  that  is  appropriate  to  that  conceptualization.  A  language  is  a 
set  of  sentences  with  a  specified  syntax  and  semantics.  The  syntax  of  a  language  defines 
the  sentences  legal  in  that  language.  The  semantics  of  a  language  defines  the  relationship 
between the sentences and the programmer’s conceptualization of the world. This relationship
is called an interpretation: it consists of a mapping between the symbols of the
language  and  the  objects,  functions  and  relations  in  the  conceptualization,  as  well  as  rules 
for  determining  the  truth  of  sentences  composed  of  these  symbols. 

4 The determination of whether an encoding satisfies some effectiveness constraint is discussed in Section
3.2.
Father(a,b)
Father(a,c)
Father(b,d)
Father(b,e)
Father(c,f)
Father(c,g)

Ancestor(a,b)
Ancestor(a,c)
Ancestor(b,d)
Ancestor(b,e)
Ancestor(c,f)
Ancestor(c,g)
Ancestor(a,d)
Ancestor(a,e)
Ancestor(a,f)
Ancestor(a,g)

Figure 2.6: The Canonical Encoding for $C_1$


First-order  predicate  calculus  is  the  language  used  in  this  thesis.  We  could  use  a 
specialized  language  (e.g.  that  of  trees  or  graphs,  musical  scores,  flowcharts  and  electrical 
circuits)  to  encode  our  conceptualization.  However,  we  take  the  position  expressed  in 
[Hay81]  and  expanded  in  [MG84]  that  specialized  languages  can  be  understood  in  terms 
of  their  translations  into  first-order  theories.  To  encode  a  conceptualization  in  first-order 
predicate  calculus,  we  need  to  select  the  non-logical  symbols  that  denote  the  various 
elements  of  the  conceptualization.  One  such  choice  is  the  canonical  language  introduced 
in  Section  1.4. 

Definition 13 The canonical encoding of a conceptualization $C$ is in the canonical
language $\mathcal{L}_c$ and lists all the tuples of the functions and the relations.

The interpretation function for canonical encodings is particularly straightforward: it
is a 1-1 map. The canonical encoding for the kinship conceptualization $C_1$ is in Figure 2.6.
An example of a non-canonical encoding that uses a canonical language is in Figure 1.3.
Yet another non-canonical encoding for $C_1$ is in Figure 2.7. Note that the encodings $E_3$ and
$E_1$ differ on the actual definition of the Ancestor relation. Both encodings, however, have
the same model: the conceptualization $C_1$. Encoding $E_4$ of $C_1$ is displayed in Figure 2.8.
$E_4$ differs from both $E_1$ and $E_3$ in its commitments to which relations are primitive and
which are computed. Whereas Father is explicitly recorded in $E_1$ and $E_3$ and Ancestor
is computed in terms of it, $E_4$ makes Ancestor the primitive relation and defines Father
using the Ancestor relation.
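
The claim that the two styles of encoding share one model can be checked mechanically on the kinship facts. The following is a minimal sketch in Python, not the thesis's implementation: the function names are hypothetical, and the Ancestor-to-Father rule is read as "an ancestor pair with no intermediate ancestor is a father pair", which is an assumption about the intended axiom.

```python
# Stored Father facts of the kinship conceptualization.
FATHER = {("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "f"), ("c", "g")}


def ancestor_from_father(father):
    """Store Father, compute Ancestor:
    Father(x,y) => Ancestor(x,y); Father(x,z) and Ancestor(z,y) => Ancestor(x,y)."""
    ancestor = set(father)
    changed = True
    while changed:  # iterate the recursive rule to a fixed point
        changed = False
        for (x, z) in father:
            for (z2, y) in list(ancestor):
                if z == z2 and (x, y) not in ancestor:
                    ancestor.add((x, y))
                    changed = True
    return ancestor


def father_from_ancestor(ancestor):
    """Store Ancestor, compute Father: keep the pairs with no intermediate z."""
    objects = {o for pair in ancestor for o in pair}
    return {
        (x, y)
        for (x, y) in ancestor
        if not any((x, z) in ancestor and (z, y) in ancestor for z in objects)
    }


ANCESTOR = ancestor_from_father(FATHER)
```

Round-tripping `FATHER` through both functions returns the original relation, so the choice of which relation to store and which to compute leaves the model unchanged.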

The  exact  relationship  between  an  encoding  and  a  conceptualization  can  be  formalized 
using  the  framework  of  first-order  logic.  We  will  view  a  conceptualization  as  a  special  kind 
of  a  structure  (akin  to  a  model,  except  that  we  reify  functions  and  relations),  and  encodings 
Father(a,b)
Father(a,c)
Father(b,d)
Father(b,e)
Father(c,f)
Father(c,g)

$\text{Father}(x,y) \Rightarrow \text{Ancestor}(x,y)$
$\text{Father}(x,z) \land \text{Ancestor}(z,y) \Rightarrow \text{Ancestor}(x,y)$

Figure 2.7: The encoding $E_3$ of $C_1$

Ancestor(a,b)
Ancestor(a,c)
Ancestor(a,d)
Ancestor(a,e)
Ancestor(a,f)
Ancestor(a,g)
Ancestor(b,d)
Ancestor(b,e)
Ancestor(c,f)
Ancestor(c,g)

$\text{Ancestor}(x,y) \land \neg\exists z.\,[\text{Ancestor}(x,z) \land \text{Ancestor}(z,y)] \Rightarrow \text{Father}(x,y)$

Figure 2.8: The encoding $E_4$ of $C_1$
as axiomatizations in first-order logic. To relate a conceptualization $C_1$ and an encoding
$E_3$, we find an interpretation function $I$ from the symbols in $E_3$ to the objects in $C_1$ such
that every sentence in the encoding holds in the conceptualization. For encoding $E_3$ the
interpretation $I$ is:

I(a) = A
I(b) = B
I(c) = C
I(d) = D
I(e) = E
I(f) = F
I(g) = G

I(Father) = {(A,B), (A,C), (B,D), (B,E), (C,F), (C,G)}

I(Ancestor) = {(A,B), (A,C), (A,D), (A,E), (A,F), (A,G), (B,D), (B,E), (C,F), (C,G)}

An encoding $E$ models a conceptualization $C = (O, \mathcal{F}, \mathcal{R})$ exactly when $C$ satisfies all
the formulae in $E$ in the sense defined below.

A well-formed formula $\phi \in E$ holds in a conceptualization $C = (O, \mathcal{F}, \mathcal{R})$ if we can find
an interpretation function $I$ from the set of parameters to items in $C$ such that

1. Every $\forall$ quantifier symbol is assigned the set $O$.

2. Every constant symbol $c$ is assigned an element $c^I$ in $O$.

3. Every n-place predicate symbol is assigned the corresponding n-place relation from $\mathcal{R}$.

4. Every n-place function symbol is assigned the corresponding n-place function in $\mathcal{F}$.

We define the function $s_0$ that maps individual variables to elements in the set $O$.
Now we extend $s_0$ to name all terms in $\mathcal{L}(C)$. The function that translates a term to the
object it denotes is $I_0$. It is defined recursively.

1. For each variable $x$, $I_0(x) = s_0(x)$.

2. For each constant symbol $c$, $I_0(c) = c^I$.

3. If $t_1, \ldots, t_n$ are terms, $I_0(f(t_1, \ldots, t_n)) = f^I(I_0(t_1), \ldots, I_0(t_n))$.
Figure 2.9: Reconceptualization and Reencoding
For any $g \in E_1$: $C_2 \models R_{syn}(g)$ if and only if $R_{sem}(C_2) \models g$

Now that we have defined how terms denote objects, we can specify how the truth
of formulas is determined. This is done by recursion on the structure of the well-formed
formulas.

1. Base case: $\phi$ is an atomic formula and is of the form $P(t_1, t_2, \ldots, t_n)$. We say that
$\models_I \phi$ if and only if $(I_0(t_1), I_0(t_2), \ldots, I_0(t_n)) \in P^I$.

2. Recursive case 1: $\phi$ is of the form $\neg\psi$. $\models_I \phi$ if and only if $\not\models_I \psi$.

3. Recursive case 2: $\phi$ is of the form $\phi_1 \land \phi_2$.
$\models_I \phi$ if and only if $\models_I \phi_1$ and $\models_I \phi_2$.
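
The recursive truth definition translates almost directly into an evaluator over a finite structure. The sketch below is in Python and rests on several assumptions that are not in the original: formulas are encoded as nested tuples, the first recursive case is taken to be negation, function symbols and quantifiers are omitted for brevity, and all names (`OBJECTS`, `RELATIONS`, `CONSTANTS`, `denote`, `holds`) are hypothetical.

```python
# A finite conceptualization: objects plus named relations (functions omitted).
OBJECTS = {"A", "B", "C", "D", "E", "F", "G"}
RELATIONS = {
    "Father": {("A", "B"), ("A", "C"), ("B", "D"), ("B", "E"), ("C", "F"), ("C", "G")},
}
# Interpretation of constant symbols: lower-case symbol -> upper-case object.
CONSTANTS = {obj.lower(): obj for obj in OBJECTS}


def denote(term, s0):
    """Map a term to its object: variables via the assignment s0, constants via I."""
    if term in s0:
        return s0[term]
    return CONSTANTS[term]


def holds(formula, s0):
    """Decide truth by recursion on the structure of the formula."""
    op = formula[0]
    if op == "atom":  # base case: P(t1, ..., tn)
        _, predicate, terms = formula
        return tuple(denote(t, s0) for t in terms) in RELATIONS[predicate]
    if op == "not":  # recursive case: negation
        return not holds(formula[1], s0)
    if op == "and":  # recursive case: conjunction
        return holds(formula[1], s0) and holds(formula[2], s0)
    raise ValueError(f"unknown connective: {op}")
```

For example, `holds(("atom", "Father", ("a", "b")), {})` evaluates the ground atom Father(a,b), while a non-empty `s0` such as `{"x": "B"}` supplies values for free variables.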

Now that we understand the relationship between a conceptualization and its encoding,
we can state the connections between changes in conceptualization and changes in encoding
more formally. Let $R_{sem}: C_2 \to C_1$ be the articulation theory between $C_1$ and $C_2$. Recall
that this articulation theory contains the definitions of the elements in $C_2$ in terms of those
in $C_1$. Let $R_{syn}: E_1 \to E_2$ be the mapping between the encodings $E_1$ of $C_1$ and $E_2$ of
$C_2$. The two mappings $R_{sem}$ and $R_{syn}$ are related as shown in Figure 2.9.
We factor changes in primitives from changes in their encoding. We structure our
search for a good reformulation as a search through the space of conceptualizations guided
by the computational constraints checked in the much denser space of encodings. The
reason that the encoding space is much denser is that not all shifts in encodings correspond
to a shift in conceptualization. For instance, the shift in encoding from $E_1$ to $E_3$ or from
$E_3$ to $E_4$ causes no change in conceptualization. The differences between these encodings
surface in

1.  Choice  of  which  element  to  store  and  which  to  compute 

2.  The  actual  definition  of  an  element  in  terms  of  the  others 

These are changes at the symbol level with no accompanying change at the knowledge
level. These shifts do not qualify as reformulations within our framework.

For  the  three  categories  of  conceptual  shifts  in  Section  2.2.2,  we  provide  encoding  shifts 
that  implement  them. 

1.  New  conceptualization  definable  in  terms  of  the  old  one 

The shift in encoding that achieves this class is the introduction of new terms with
the appropriate definitions and the re-axiomatization of the old encoding using these
new (eliminable) defined terms. A good example of this is the introduction of the
FoundingFather symbol in the encoding $E_1$ and the re-axiomatization of SameFamily
in terms of it, to generate $E_2$.

2.  New  conceptualization  consistent  with  the  old  one 

The encoding shift that accomplishes this is the introduction of new non-eliminable
terms and subsequent re-axiomatization using these terms. The new encoding is a
consistent extension of $E_1$.

3.  New  conceptualization  inconsistent  with  old  conceptualization 

Implementing this involves dropping and adding new terms as before and re-axiomatizations
that make non-monotonic changes to the encoding.


2.4  A  Catalogue  of  Examples 

Reformulation is a diverse phenomenon, as the following set of examples indicates. In all
these cases, we note that there is a change in the basic terms used to describe a problem.
A classic example is rewriting the missionaries and cannibals (M and C) problem [Ama68],
phrased in terms of individuals, into a formulation that is based on the cardinalities of the
sets of missionaries and cannibals. This is a reformulation that improves the computational
efficiency of solution of the M and C problem. Other examples are

1. The Copernican theory of planetary motion is a reformulation of the geocentric
theory. Both theories assume the same objects and functions. However, the relations
among  the  objects  assumed  differ  in  the  two  systems.  The  conceptual  shift  preserves 
the  observed  data  about  the  motion  of  the  various  planets.  This  reformulation  is 
akin  to  a  shift  of  origin  in  coordinate  geometry. 

2. Reformulation of a theory in terms of another. In his 1976 thesis on reformulation,
Mark [Mar76a] reformulates the managerial problem of hiring in a firm in
terms of Keynesian economics. Minsky argues [Min86] that this is the way humans
understand  new  things:  by  casting  them  in  terms  of  familiar  theories.  Most  engineers 
map  problems  in  heat  conduction  into  corresponding  problems  in  analog  circuits 
since  both  these  domains  share  the  same  behavioral  abstractions  [Gre85]. 

3.  Granularity  shifts 

(a) Temporal: Shifting between instant and interval representations of time is
essential for building efficient planners that deal with time.

(b)  Spatial:  A  road  is  viewed  as  a  line  for  the  purposes  of  planning  a  trip,  a  surface 
when  one  crosses  it  and  as  a  volume  for  digging  it.  Most  reasoners  about  the 
common  sense  physical  world  need  to  be  able  to  shift  between  these  views  of 
space  when  appropriate.  Jerry  Hobbs  [Hob85]  outlines  a  scheme  whereby  we 
can  capture  the  connections  between  the  views  as  logical  theories. 

(c)  Aggregation  of  objects:  the  full  adder  abstraction  is  an  example  from  digital 
circuits.  The  notion  of  Thevenin  and  Norton  equivalents  in  analog  circuits  is 
a  reformulation  that  aggregates  a  large  number  of  circuit  elements  into  one 
lumped  parameter. 

(d)  Equivalence  class  reformulations:  partitioning  the  integers  into  odd  and  even  as 
in  the  checkerboard  reformulation  and  the  introduction  of  “d”  in  propagation 
of  faults  in  digital  circuits  are  examples  of  this  phenomenon. 
4. The Aha! Insight reformulations. These include reformulations like the Laplace and
Fourier transforms, the star-delta transformation in analog circuits, as well as the
polar-rectangular coordinate transformations in which there is no loss of information
across the conceptual shift. These are impossible to automate.

5.  Making  symmetries  explicit 

6.  Control  reformulations:  An  example  is  the  reformulation  that  transforms  the  naive 
formulation of the Fibonacci function, which has exponential complexity, to a
tail-recursive one that is linear. These reformulations are called control reformulations
because  they  can  be  equally  well  implemented  by  changing  the  control  strategy  of  a 
problem  solver  while  keeping  the  formulation  intact. 

7.  Change  of  perspective:  these  involve  changing  what  constitutes  the  “figure”  and  the 
“ground”  elements  in  a  conceptualization.  A  good  example  is  the  re-conceptualization 
of  the  8  puzzle  where  the  actions  are  initially  expressed  in  terms  of  the  tiles,  into 
one  where  the  actions  are  expressed  in  terms  of  the  movements  of  the  blank  tile. 

8.  Data  structure  reformulations:  These  cover  pure  encoding  transformations  with  no 
change  in  conceptualization.  Ordering  conjuncts  in  a  query,  storing  a  relation  in  a 
hash  table  instead  of  a  list  are  examples. 

9.  Notational  variants:  a  classic  example  is  the  reformulation  of  Roman  numerals  to 
Arabic. These are very hard to discover automatically, because the space of notational
variations is hard to describe.

10.  Structure-function  reformulations:  Minsky’s  arch  example  is  an  instance.  If  arches 
are  described  in  structural  terms  (blocks,  and  the  support  relations  between  blocks), 
it would be difficult for a system to recognize structurally different arches as instances
of the same basic concept. If, however, arches were described in terms of
the  functionally  motivated  predicates  Body  and  Support  which  stand  for  the  top  and 
bottom  parts  of  an  arch,  recognition  would  be  trivial. 

11. Reformulations that cause compression of reasoning chains: these generally do not
involve introducing new objects and include constant folding and loop jamming from
compiler optimisations and computing prejoins in database query optimisations.
Another example is compiling a rule set for diagnosing a circuit from a description
of its structure and behaviour.

12.  Reformulations  from  an  objective  to  a  subjective  ontology  of  the  world.  Subjective 
ontologies  have  been  used  by  Agre  and  Chapman  [AC87]  as  well  as  Brooks  [Bro87] 
to  build  agents  that  act  in  the  world  in  real  time.  Subjective  ontologies  are  shown 
to  be  reformulations  of  the  objective  ontology  of  the  world  assumed  by  the  situation 
calculus  [SW89]. 
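
The control reformulation of item 6 can be made concrete with a short sketch. The Python below is illustrative only: the function names and the `CALLS` counter are assumptions introduced here, and since Python does not eliminate tail calls, the second function is tail-recursive in form only.

```python
# Call counters, used only to compare the amount of work done by each formulation.
CALLS = {"naive": 0, "tail": 0}


def fib_naive(n):
    """Naive formulation: two recursive calls per step, exponential number of calls."""
    CALLS["naive"] += 1
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


def fib_tail(n, a=0, b=1):
    """Accumulator-passing (tail-recursive) formulation: one call per step."""
    CALLS["tail"] += 1
    if n == 0:
        return a
    return fib_tail(n - 1, b, a + b)
```

Both functions compute the same values; only the number of recursive calls differs, which is why the same gain could also be obtained by changing the problem solver's control strategy (for instance, by caching subgoals) while keeping the naive formulation intact.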

2.5  Taxonomy  of  reformulations 

Reformulations  are  of  two  basic  kinds:  inductive  and  deductive.  Deductive  reformulations 
lead  to  the  formation  of  new  conceptualizations  where  a  class  of  goals  solvable  in  the 
original  formulation  is  solved  faster.  Inductive  reformulations  are  those  in  which  the  new 
conceptualization  solves  goals  that  couldn’t  be  solved  before.  The  reconceptualizations  in 
the  missionaries  and  cannibals  problem  as  well  as  the  kinship  reformulation  are  deductive. 
An  example  of  an  inductive  reformulation  is  the  structure-function  example  from  above. 

For  deductive  reformulations,  we  can  specify  correctness  constraints,  viz.,  the  goals 
that  need  to  be  preserved  across  the  conceptual  shift,  as  well  as  the  goodness  constraints, 
viz.,  the  bounds  on  time  and  space  in  computing  the  goal  formulas  in  an  encoding  of  the 
reconceptualization.  In  inductive  reformulations,  the  clean  separation  between  correctness 
and  goodness  does  not  obtain.  Utility  is  the  chief  issue  (i.e.,  what  is  good  is  correct). 
Inductive  reformulations  are  described  in  [RS88].  We  focus  on  the  problem  of  automating 
deductive  reformulations  in  this  thesis. 

Deductive reformulations themselves come in three categories: abstractions, refinements
and isomorphisms. A reconceptualization $C_2$ of $C_1$ is

1. An Abstraction: if Definable-C($C_2$, $C_1$) and $C_2$ is a correct reformulation of $C_1$
according to Definition 12 with respect to some goals $G$. The kinship reformulation
of Chapter 1 is an example.

2. A Refinement: if Definable-C($C_1$, $C_2$) and $C_2$ is a correct reformulation of $C_1$ according
to Definition 12 with respect to some goals $G$. Consider the following formulation
of a puzzle called the hermit puzzle. A hermit starts at the bottom of a hill at 8 am
one morning and climbs to the top by 5 pm. He returns to the bottom of the hill the
next day at 5 pm after starting his journey at the top at 8 am. The goal is to show
that there is some point on the path up the hill that the hermit passed at the same
time but on different days. Introducing another hermit who comes down the hill at the
same time that the first one starts makes the solution transparent. The new
conceptualization introduced an object into the formulation that cannot be constructed
from the old one: it is thus a refinement reformulation.

3. An Isomorphism: if Definable-C($C_2$, $C_1$) and Definable-C($C_1$, $C_2$), and $C_2$ is
a correct reformulation of $C_1$ according to Definition 12 with respect to some goals
$G$. The rectangular to polar coordinate transformation in coordinate geometry is an
instance of an isomorphic reformulation.

This  thesis  addresses  the  issue  of  automating  abstraction  reformulations  for  computational 
efficiency. 
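
The two-hermit argument in the refinement example can be written out as a short worked derivation. The symbols $u$, $d$, $L$ and $h$ are introduced here for illustration and do not appear in the original; continuity of both journeys is an added assumption.

```latex
% u(t): distance along the path from the bottom on day 1 (ascent),
% d(t): distance along the path from the bottom on day 2 (descent),
% both for clock times t in [8, 17]; L > 0 is the length of the path.
% Assumption: u and d are continuous functions of t.
\begin{align*}
  u(8) &= 0,  &  u(17) &= L, \\
  d(8) &= L,  &  d(17) &= 0, \\
  h(t) &:= u(t) - d(t), \\
  h(8) &= -L < 0,  &  h(17) &= L > 0.
\end{align*}
% By the intermediate value theorem there is a time t^* in (8, 17) with h(t^*) = 0,
% that is, u(t^*) = d(t^*): the two journeys pass the same point at the same clock time.
```

Superimposing the two days, which is what the second hermit does, turns the question into whether the two walkers meet, and the sign change of $h$ shows that they must.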


2.6  Automating  Reformulation 

In any knowledge base, and for any intelligent agent, it is essential to make
the right distinctions in order to be able to organize and cope with the complexities
of the real world. In order to know what constitutes a good set of
individuals, categories, attributes and relations, we have to understand how
the possession of, for example, a category in one’s vocabulary assists in making
appropriate decisions.

from Lenat et al. (eds), The Ontological Engineer’s Handbook, 2nd ed.,
Addison-Wesley, 1997, p. 1.

There are two parts to the reformulation problem: the epistemological part and the
heuristic part. The epistemological part is concerned purely with what constitutes a
correct reformulation of a problem, i.e. what the space of reformulations is. The heuristic
part is concerned with how we can actually generate reformulations. The solution to
the epistemological part is a proposal for a generator of reformulations; the heuristic
part addresses the question of how to tame this generator. Both parts are important for
a satisfactory solution to the problem, but the epistemological part has to be resolved
before the heuristic aspects can be tackled. The role of an epistemological analysis of
reformulation is to tell us what a reformulation is, and what the logical character of a
reformulation inference is.

2.6.1  Formulating  the  Automation  Problem 

Given  that  reformulation  is  the  process  of  changing  distinctions  in  the  world,  and  that  it 
is  achieved  by  changing  theories,  we  can  now  frame  the  automation  question  for  deductive 
abstraction  reformulations. 

Given

• An initial conceptualization $C_1$ and its encoding $E_1$

• Goals G (correctness constraints)

• Problem solver PS: description of behaviour + cost model

• Computational constraints S on solution of goal by PS (goodness constraints)

Produce a new abstract conceptualization $C_2$ and implement it in the encoding $E_2$ that
meets correctness and goodness constraints.

2.6.2  Previous  Work 

Much  of  the  earlier  work  on  reformulation  has  been  of  an  exploratory  nature.  The  most 
influential  piece  of  research  was  that  of  Saul  Amarel  in  1968  [Ama68].  Amarel  presented 
examples of reconceptualizations and re-encodings for computational efficiency in the
missionaries and cannibals puzzle. He also speculated on methods for their mechanization.
One important shift in this puzzle is the abstraction of the named individuals into
appropriate sets. It set the stage for the creation of abstract action operators that make
the  solution  of  the  problem  very  efficient.  This  reformulation  can  now  be  automated  by 
general  methods  proposed  in  this  thesis. 
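
The effect of abstracting named individuals into sets can be illustrated by counting states. The Python sketch below is not Amarel's formulation or the method of this thesis: it assumes the standard puzzle with three missionaries, three cannibals and a two-person boat, and every name in it (`abstract`, `safe`, `shortest_plan_length`) is hypothetical.

```python
from collections import deque
from itertools import product

N = 3  # number of missionaries and of cannibals (assumed standard puzzle)


def abstract(assignment, boat):
    """Map an individual-level state to (missionaries_left, cannibals_left, boat)."""
    m_left = sum(1 for side in assignment[:N] if side == "L")
    c_left = sum(1 for side in assignment[N:] if side == "L")
    return (m_left, c_left, boat)


# Individual-level states: a bank for each of the 2N people, plus the boat's bank.
individual_states = [(a, b) for a in product("LR", repeat=2 * N) for b in "LR"]
# Cardinality-level states: the image of the abstraction map.
abstract_states = {abstract(a, b) for a, b in individual_states}


def safe(m, c):
    """Missionaries are not outnumbered on either bank (when any are present)."""
    return (m == 0 or m >= c) and (N - m == 0 or N - m >= N - c)


def shortest_plan_length():
    """Breadth-first search over the cardinality-level state space."""
    start, goal = (N, N, "L"), (0, 0, "R")
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        (m, c, boat), depth = frontier.popleft()
        if (m, c, boat) == goal:
            return depth
        sign = -1 if boat == "L" else 1  # the boat carries people away from its bank
        for dm, dc in [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]:
            nm, nc = m + sign * dm, c + sign * dc
            nxt = (nm, nc, "R" if boat == "L" else "L")
            if 0 <= nm <= N and 0 <= nc <= N and safe(nm, nc) and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None
```

The individual-level description has 128 raw states and the cardinality-level description has 32, and the search for a plan can be carried out entirely in the smaller space.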

The  difficulties  of  automating  reformulation  deterred  research  in  this  area  for  a  long 
while.  In  1980,  Korf  [Kor80]  attempted  to  characterize  the  nature  of  the  information- 
processing  that  occurs  during  special-purpose  reformulations.  He  defined  a  set  of  rewrite 
rules  on  encodings  that  set  up  a  space  of  possible  re-representations.  His  theory  explained 
representation  shifts  in  the  Tower  of  Hanoi  puzzle  as  well  as  some  examples  from  floor 
planning. In 1981, Jack Mostow designed a program that reformulated advice in the game
of Hearts: he devised 400 transformation rules to generate the space of reformulations.
These transformation rules were a set of relational rewrite rules of the form exp1 → exp2,
and they were used to generate new encodings from the given one. These rules formed
the generator for the space of encodings, and the computational constraints were used as
testers to determine if an appropriate encoding E2 had been found. The architecture
of this solution is sketched in Figure 2.10. The main disadvantage of this approach was
that there were too many rewrite rules even in domains of medium complexity (Hearts)
and the control problems were formidable. They were solved by having the user guide the
reformulation process. Another major disadvantage of the approach was that there was no
principled way of actually generating these rewrite rules from more global considerations.

Figure 2.10: The Transformational Approach to Reformulation
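
The generate-and-test architecture described above can be sketched in a few lines. This is only an illustration under stated assumptions: encodings are plain strings, the three rewrite rules are made up for the example (they are not Mostow's rules), and the tester `acceptable` stands in for the computational constraints.

```python
from collections import deque


def rewrite(lhs, rhs):
    """Build a rule of the form lhs -> rhs that rewrites one occurrence at a time."""
    def rule(encoding):
        results = []
        start = encoding.find(lhs)
        while start != -1:
            results.append(encoding[:start] + rhs + encoding[start + len(lhs):])
            start = encoding.find(lhs, start + 1)
        return results
    return rule


def generate_and_test(initial, rules, acceptable, max_depth=3):
    """Breadth-first search of the space of encodings spanned by the rules.

    The rules act as the generator; `acceptable` acts as the tester.
    Returns the first acceptable encoding and the number of encodings examined.
    """
    frontier = deque([(initial, 0)])
    seen = {initial}
    explored = 0
    while frontier:
        encoding, depth = frontier.popleft()
        explored += 1
        if acceptable(encoding):
            return encoding, explored
        if depth == max_depth:
            continue
        for rule in rules:
            for new_encoding in rule(encoding):
                if new_encoding not in seen:
                    seen.add(new_encoding)
                    frontier.append((new_encoding, depth + 1))
    return None, explored


# Hypothetical toy rules and constraint: find an encoding of length at most 2.
rules = [rewrite("ab", "c"), rewrite("c", "d"), rewrite("a", "aa")]
result, explored = generate_and_test("abab", rules, lambda s: len(s) <= 2, max_depth=4)
```

Even in this toy space the third rule keeps producing longer encodings, so the search is bounded only by `max_depth`. This mirrors, in miniature, the control problem that arises when the number of rewrite rules is large.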

In  1983,  Utgoff  [Utg86]  attacked  the  inductive  reformulation  problem  in  the  context 
of  the  LeX  [MTMB82]  problem  solver.  He  proposed  a  method  called  back-propagation  to 
refine  the  concept  of  integer  to  odd-integer  to  make  an  integration  heuristic  expressible. 
The  chief  insight  in  this  work  was  the  explicit  use  of  problem-solving  goals  to  guide  the 
addition  of  new  concepts.  The  addition  of  new  concepts  to  an  inductive  system  to  make 
prediction  efficient  was  studied  by  Fu  &  Buchanan  [FB85].  This  work  proposed  two 
general  heuristics  and  was  tested  empirically  in  the  domain  of  medical  diagnosis.  In  1986, 
Richard  Keller  [Kel87]  introduced  a  scheme  for  the  addition  of  new  concepts  into  the  LeX 
problem  solver  that  would  make  the  solution  of  some  queries  very  efficient.  He  used  a 
sophisticated  model  of  problem  solving  and  empirical  methods  for  testing  the  efficacy  of 
a  given  representation  against  a  class  of  goals. 

More  recently,  there  has  been  a  flurry  of  work  on  the  problem  of  automatic  intro¬ 
duction  of  new  vocabulary  to  make  problem  solving  and  machine  learning  more  efficient. 
Most  of  them  focus  on  very  special  classes  of  deductive  and  inductive  abstractions.  All  of 
them  pose  the  reformulation  question  at  the  level  of  encodings.  Patricia  Riddle  [Rid88] 
under  Saul  Amarel  has  attempted  to  automate  the  formation  of  deductive  abstractions 
of  problems  formulated  in  the  state-space  framework.  She  has  focussed  on  the  creation 
of  macro-operators  that  speed  up  the  solution  of  a  class  of  queries.  Mike  Lowry  [Low88] 
has developed abstraction methods based upon the theory of abstract data types to synthesize
algorithms from specifications. Muggleton [Mug88] has devised a machine learning
scheme that introduces new predicates to make the resulting generalization compact.
Flann [Fla88] reformulates theories expressed in structural terms into functional terms to
make a recognition task more efficient. The novel aspect of this work is the use of examples
to guide selection of useful functional terms. The generation of functionally useful classes
by induction from problem solving traces is also the theme in Jeff Schlimmer’s work on
representation change [SP88].

2.6.3  The  Justification  Based  Approach 

Our  credo  is  that  automation  of  reformulations  is  not  possible  unless  the  reformulator  can 
justify  a  shift  in  conceptualization.  The  hope  is  that  if  the  justification  or  explanation 
of  a  reformulation  is  done  at  the  right  level  we5  can  exploit  it  to  automate  the  process. 
Standard  explanations  for  the  correctness  and  goodness  of  a  reformulation  are  at  too  low 
a  level  of  detail;  they  obscure  the  important  aspects  of  the  proof  that  are  needed  for 
generation.  This  is  the  same  insight  found  in  the  analysis  by  synthesis  work  in  the  domain 
of electrical circuits by Sussman [Sus77]. A circuit can be synthesized by writing down the
equations for behaviour in terms of the unknown values and solving for them. This usually
results in huge systems of equations. However, we can use knowledge of the form of the
answer to guide us in setting up the equations cleverly, so they can be easily inverted. As
Sussman puts it: it is knowing what algebra to do that separates a good circuit designer
from a bad one. The explanation framework introduced in this thesis provides a way of
doing  algebra  on  formulations  in  a  clever  way. 

The justification based approach involves asking the questions: Why is the conceptualization
C2 a reformulation of C1? Why is E2 a reformulation of E1? These are
explanation-seeking why questions in Hempel’s [Hem65] terms. Our object in doing this
is  to  articulate  the  knowledge  and  the  reasoning  that  goes  into  deliberate  reformulation 
so  that  we  can  compile  it  into  algorithms  for  automatic  reformulation.  But  what  does  an 
explanation  or  a  justification  for  a  reformulation  look  like? 

Reformulation  can  be  defined  as  the  process  of  inferring  a  new  conceptualization  and 
a  new  theory  in  that  conceptualization  that  preserves  the  goal  and  that  satisfies  the  given 
effectiveness  criteria  S  with  respect  to  a  problem  solver  PS. 

From A, C1, E1, g1, S infer C2, E2, g2.

This is a non-deductive argument in that the conclusions do not follow syntactically
from the premises. The justification problem then is to find a criterion which, if satisfied
by a reformulation inference, sufficiently establishes the truth of the conclusions. Our
objective is to find general principles which, when taken with background knowledge and
added to the premises of the reformulation inference, make the conclusion follow soundly.

5 as designers of automated reformulators

The  goal  of  this  research  is  to  provide  a  normative  justification  for  a  reformulation 
inference  and  in  doing  so  provide  a  general  form  for  the  background  knowledge  needed 
to  draw  reformulation  conclusions  regardless  of  the  specific  method  used  to  derive  them. 
That  is,  we  are  interested  in  enumerating  the  space  of  reformulation  conclusions  starting 
from  an  initial  set  of  premises.  Some  criteria  on  the  nature  of  this  justification  knowledge 
are: 

•  Content 

The  justification  should  be  a  declarative  statement  of  the  factors  that  come  into  play 
in  the  choice  of  formulation:  what  the  semantics  of  that  choice  are  and  what  role 
they  play  in  the  problem  solving  process.  Much  of  this  knowledge  is  left  implicit 
in  present  day  systems,  so  when  computational  constraints  are  changed,  a  system 
cannot  realign  conceptual  boundaries  in  a  knowledge-based  way. 

•  Generality 

The  justifications  should  be  domain  and  problem-solver  independent.  This  does  not 
mean that they will be insensitive to such knowledge; it simply necessitates factoring
out  as  much  of  the  domain  and  problem-solver  specific  information  as  possible  to 
facilitate  reuse. 

•  Generative  power 

The  justification  should  be  structured  in  such  a  way  that  it  can  be  used  to  generate 
new  formulations.  This  is  akin  to  the  technique  of  synthesis  by  analysis  in  analog 
circuit design [Van74]. The justifications then, are not purely explanatory in nature;
they  can  be  run  in  reverse  to  suggest  the  space  of  possible  reformulations  that 
satisfy  the  correctness  and  goodness  constraints. 

Justifying changes in conceptualization requires that we be able to justify conceptualizations
in the first place. A conceptualization partitions the universe in a certain way.
A  justification  for  a  conceptualization  articulates  the  epistemological  and  computational 
consequences  of  assuming  those  distinctions.  A  justification  for  a  change  of  conceptual¬ 
ization  is  an  explanation  for  the  introduction  or  removal  of  some  conceptual  elements  for 
the achievement of the given goals with new resource constraints. These justifications are
thus meta-theoretical. They attempt to answer why the current formulation fails to meet
the specified computational constraints.

* as suggested by Sussman in his work on slices

Figure 2.11: The Justification Based Approach (diagram output: New Formulation that meets correctness and goodness constraints)

Figure 2.12: The Relevance of Irrelevance (diagram labels: Irrelevance Claims, Irrelevance Principle, Reduction Inference, Formulation 1 → Formulation 2)

Justifications  for  reformulations  that  satisfy  Definition  12  have  two  parts  to  them:  the 
correctness  proof  that  guarantees  that  the  new  formulation  obtains  the  same  answers  on 
the  given  set  of  goals  and  the  goodness  proof  that  shows  that  the  new  formulation  has 
better  computational  properties  modulo  a  given  problem  solver.  These  explanations  in¬ 
voke  properties  of  the  current  conceptualization  as  well  as  the  present  encoding  to  explain 
the  epistemic  and  computational  role  of  the  distinctions  being  made.  The  advantage  of 
these  meta-theoretical  justifications  is  that  they  tie  the  change  in  formulation  directly  to 
a change in computational properties. One class of such explanations is irrelevance justifications.
We then use our knowledge of conceptualizations, encodings, and the problem
solver  to  redesign  the  formulation  to  meet  the  new  constraints. 

2.6.4  Irrelevance  Justifications 

One of the simplest roles a distinction could play is that it has no part or has a dispensable
part  in  the  computation  of  a  set  of  queries.  We  then  say  that  the  vocabulary  term  is 
irrelevant  to  the  particular  set  of  queries.  An  irrelevance  justification  for  an  abstraction 
reformulation  explains  why  some  conceptual  elements  were  expendable  and  why  some 
distinctions  can  be  collapsed  to  get  more  abstract  terms.  The  irrelevance  principle  states 
that  a  formulation  should  be  changed  to  eliminate  all  distinctions  irrelevant  to  the  present 
goals. It thus sanctions the inferences that logically minimize distinctions made in a
formulation. Thus, we can integrate momentary objects into appropriate intervals as well
as individual points in space into lines. The compression of particulars into a universal, the
subject of machine learning, is also governed by the same principle: generalization occurs
because we discard distinctions that do not have predictive utility. At the meta-theoretic
level,  the  irrelevance  principle  sanctions  a  move  towards  conceptualizations  and  encodings 
that  make  no  irrelevant  distinctions  as  in  Figure  2.12.  The  principle  simplifies  computation 
in  a  theory  by  making  the  objects  assumed  by  it  as  few  and  as  large  as  is  consistent 
with  the  correctness  constraints.  This  ontological  economy  in  describing  problems  entails 
computational  savings.  We  only  make  distinctions  necessary  for  the  purpose  at  hand.  This 
informal idea, expounded in [Qui63, Har86] among others, is made precise in the succeeding
chapters  so  that  we  can  design  a  machine  that  obeys  this  principle. 


Chapter  3 

The  Theory  of  Irrelevance 


3.1  Introduction 

Often,  wisdom  is  knowing  what  to  ignore.  An  autonomous  resource-limited  agent  with  a 
very  detailed  theory  of  the  world  should  be  able  to  reformulate  it  to  a  simpler  theory  that 
allows  it  to  make  predictions  at  the  required  level  of  accuracy  within  the  given  resource 
constraints.  Such  an  agent  has  to  identify  distinctions  made  in  its  conceptualization 
of  the  world  that  are  irrelevant  to  the  class  of  predictions  it  is  designed  to  make,  and 
weaken  its  theory  by  removing  irrelevant  distinctions.  The  theory  of  irrelevance  provides 
a  logical  basis  for  justified  discarding  and  ignoring  of  information  and  the  construction  of 
computationally  effective  theories  from  detailed,  intractable  ones. 

3.2  Motivations 

There  is  too  much  information  in  the  world  and  an  intelligent  agent  has  to  focus  selectively 
on  it  and  structure  it  in  effective  ways  to  achieve  its  goals.  An  agent  thus  needs  to  reason 
about  what  information  can  be  ignored  and  why.  Removing  irrelevant  information  has 
important  consequences,  both  computational  and  epistemic.  In  Abstrips  [Sac74],  the 
ignoring of preconditions of lower criticality while attempting to achieve an abstract plan
at a certain criticality level leads to overall efficiency in the planning process. In case-based
expert systems for medical diagnosis, the introduction of additional evidence often
degrades accuracy of performance because the number of spurious matches increases. The
quality of answers and explanations obtainable from a knowledge-based system is improved
if these systems are endowed with the ability to explicitly reason about what to ignore in
a goal-sensitive way.

The  theory  of  irrelevance  is  a  tool  for  specifying  as  well  as  deriving  classes  of  infor¬ 
mation  that  can  be  ignored  in  the  context  of  particular  goals.  For  instance,  in  Dendral, 
there  are  two  classes  of  mass  spectrograph  points  that  are  ignored  for  the  purposes  of  the 
structure  interpretation  task.  Scientists  have  much  sharper  criteria  for  the  data  points 
to  ignore  rather  than  the  data  points  to  include.  Specification  of  irrelevance  claims  is  a 
valuable  mode  of  expressing  knowledge  about  a  domain. 

Another  important  reason  for  building  the  ability  to  reason  about  irrelevance  into 
systems is that we would like to give advice about irrelevance of entities to a problem-solving
system. For example, in the missionaries and cannibals problem, we would like
to tell the problem-solver that the names of the missionaries and cannibals are irrelevant,
and  have  the  system  clump  the  missionaries  and  cannibals  into  sets  and  reason  with 
the  cardinalities  of  these  sets.  Amarel  indicates  this  sort  of  reasoning  in  his  well-known 
1968  paper  [Ama68].  Removing  irrelevant  facts  and  objects  from  a  formulation  is  an 
important  method  of  changing  representations.  Yet  another  motivation  for  reasoning  about 
irrelevance  is  the  need  for  problem-solving  systems  to  reason  flexibly  at  varying  grain  sizes 
[Hob85].  These  systems  require  the  ability  to  recognize  and  ignore  detail  irrelevant  to  their 
current  goals  in  order  to  shift  to  a  bigger  grain  size  where  those  goals  can  be  achieved  more 
efficiently. 

Reasoning  about  irrelevance  is  equally  important  in  learning  and  theory  formation. 
Minsky  explains  the  role  of  irrelevance  in  learning  very  powerfully  in  the  following  excerpt 
from The Society of Mind.

... we never ever face the same appearance twice of anything. We are
almost certain the next time to be looking from a different viewpoint, nearer or
further, higher or lower, in a different colour or against a different background.
So unless our minds simplify away the inessential aspects of each scene, we
could never learn anything.

Reasoning  about  irrelevance  can  thus  be  used  as  a  basis  for  focusing  attention  in  both 
inductive  and  deductive  tasks.  In  induction,  irrelevance  claims  bias  the  learner  towards 
the construction of simpler and computationally more effective generalizations. In deductive
tasks, irrelevance statements help focus search by identifying unfruitful or redundant
paths.  They  also  help  restructure  the  search  space  by  introducing  new  primitives. 

3.2.1  Guide  to  Chapter

To this end, this chapter (which is an extension of [SG87]) introduces meta-level statements
about irrelevance that allow us to specify which distinctions in a formulation can be
dispensed  with  on  logical  or  utilitarian  grounds  with  respect  to  a  given  task.  We  then  out¬ 
line  the  semantics  and  properties  of  statements  about  irrelevance.  We  present  a  theoretical 
framework  for  irrelevance  and  develop  a  hierarchy  of  logics  that  capture  different  senses  of 
irrelevance.  The  logics  of  irrelevance  serve  as  a  language  for  specifying  irrelevance  claims 
in  the  world  and  their  associated  calculi  allow  us  to  draw  new  irrelevance  conclusions 
from given ones. We present proofs of irrelevance in the logics. Efficient graph-theoretic
compilations  of  some  of  the  calculi  are  also  given. 

3.3  Informal  Semantics  of  Irrelevance 

A fact f is irrelevant to the goal schema g in the context of a set of sentences T, written
as Irrelevant(f, g, T), if perturbing the value of f in T does not affect that of g. Informally,
the following conceptual derivative is calculated.

$$\mathit{Irrelevant}(f, g, T) \equiv \left( \left. \frac{\partial g}{\partial f} \right|_{T} = 0 \right)$$

This is an exact irrelevance claim. Approximate irrelevance claims are those in which the
above derivative does not equal zero, but equals some ε very close to zero.
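
The perturbation reading of an exact irrelevance claim can be made concrete with a small sketch. It assumes a propositional Horn theory, reads "perturbing f" as toggling whether f is present in T, and reads "the value of g" as whether g is derivable; the fact and rule names are hypothetical and chosen only for illustration.

```python
def closure(facts, rules):
    """Forward-chain Horn rules (body, head) to a fixed point."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and set(body) <= derived:
                derived.add(head)
                changed = True
    return derived


def irrelevant(f, g, facts, rules):
    """True if g is derivable both with and without f, or in neither case."""
    with_f = g in closure(set(facts) | {f}, rules)
    without_f = g in closure(set(facts) - {f}, rules)
    return with_f == without_f


# Hypothetical toy theory.
facts = {"price_of_tea_is_high", "has_advisor", "has_topic"}
rules = [
    (("has_advisor", "has_topic"), "writes_thesis"),
    (("price_of_tea_is_high",), "tea_is_expensive"),
]

tea_claim = irrelevant("price_of_tea_is_high", "writes_thesis", facts, rules)
topic_claim = irrelevant("has_topic", "writes_thesis", facts, rules)
```

In this toy theory the claim holds for the tea fact and fails for `has_topic`, because removing the latter blocks the only derivation of the goal.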

Now  we  present  some  examples. 

1.  The  Price  of  Tea  in  China 

Given  our  current  knowledge  of  economic  theory,  the  price  of  tea  in  China  is  irrelevant  to 
my  writing  this  thesis.  This  is  an  exact  irrelevance  claim.  Even  if  we  changed  the  value  of 
the  price  of  tea  in  China  in  our  theory  of  the  world,  the  change  would  not  propagate  to  the 
fact  about  my  writing  this  thesis.  The  irrelevance  claim  is  a  fact  about  the  relationship 
between  two  facts  in  our  theory  of  the  world.  There  are  two  possible  meanings  to  this  meta- 
relation.  One,  even  if  we  simplified  our  theory  of  the  world  by  discarding  information  about 
the  price  of  tea  in  China,  the  truth  value  of  the  proposition  about  my  writing  this  thesis 
would  not  be  affected.  This  is  the  subtractive  semantics  of  irrelevance.  Its  counterpart  is 
the  additive  semantics  that  says  that  even  if  we  added  information  about  the  price  of  tea 
in  China,  it  would  not  help  us  conclude  anything  more  about  my  thesis  writing.  Since 
the relations Price-of-Tea and Write-Thesis are logically independent in our theory of the
world,  these  two  interpretations  of  irrelevance  are  identical. 

2.  The Hybrid-pi Model

As  an  example  of  an  approximate  irrelevance  claim  consider  the  following.  For  the  purposes 
of  computing  the  low  frequency  gain  of  a  transistor  using  the  hybrid-pi  model,  the  base- 
emitter  and  base-collector  capacitances  are  irrelevant.  These  two  components  complicate 
the  analysis  of  the  circuit  without  providing  a  commensurate  increase  in  accuracy  for 
the  low  frequency  case.  Under  these  conditions,  we  would  like  to  ignore  the  effects  of 
the  capacitances  and  simplify  the  model  by  dropping  them.  In  the  previous  example,  the 
elimination  of  the  Price-of-tea  relation  did  not  influence  the  truth  of  the  proposition  about 
my  thesis:  here  removing  the  capacitors  causes  the  value  of  the  gain  to  change,  but  not 
significantly. 

Notice that the irrelevance claim about transistors is a conditional one: it is true only
for frequencies less than 50 Hz. Also, unlike the previous example, an object (a base-emitter
capacitor in the hybrid-pi model) is specified as irrelevant. We now consider what
it  means  for  an  object  to  be  irrelevant.  A  subtractive  semantics  for  object  irrelevance  is: 
even  if  we  removed  the  object  from  the  model  of  the  theory,  we  would  still  be  able  to  solve 
for  the  goal.  An  additive  semantics  for  object  irrelevance  is:  addition  of  the  object  does 
not  tell  us  any  more  about  the  goal  (in  particular,  it  does  not  make  a  previously  unsolvable 
goal  solvable).  Object  irrelevance  can  be  treated  as  a  special  case  of  fact  irrelevance;  f 
would  then  be  the  statement  that  an  object  with  the  requisite  properties  exists. 

The irrelevance claim above can be determined by a meta-theoretical analysis of the
equations  for  calculating  the  gain  using  the  hybrid-pi  model.  Under  the  low  frequency 
condition,  the  capacitive  terms  are  second-order  effects:  an  order  of  magnitude  reasoning 
over  the  various  terms  that  contribute  to  the  gain  shows  that  the  capacitive  terms  have  a 
negligible  effect.  Proving  irrelevance  claims  is  a  creative  endeavour:  in  this  case  the  proof 
requires  doing  a  sensitivity  analysis  of  the  gain  equations. 
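
A rough numerical version of such a sensitivity analysis is sketched below. Everything in it is an assumption made for illustration: the component values are invented, the midband gain is taken as gm times the load resistance, and the two capacitances are folded into a single input pole using the Miller approximation. It is not the analysis used in the thesis.

```python
import math


def gain_magnitude(freq_hz, gm=0.04, r_load=1e3, r_in=1e3,
                   c_pi=10e-12, c_mu=2e-12, include_caps=True):
    """Voltage gain magnitude of a simplified single-pole amplifier model."""
    midband = gm * r_load
    if not include_caps:
        return midband  # simplified model: capacitors dropped
    # Miller approximation: c_mu appears multiplied by (1 + gain) at the input.
    c_total = c_pi + c_mu * (1 + gm * r_load)
    pole_hz = 1 / (2 * math.pi * r_in * c_total)
    return midband / math.sqrt(1 + (freq_hz / pole_hz) ** 2)


def relative_change(freq_hz):
    """Relative change in gain caused by dropping the capacitors."""
    full = gain_magnitude(freq_hz)
    simplified = gain_magnitude(freq_hz, include_caps=False)
    return abs(full - simplified) / full


def approximately_irrelevant(freq_hz, epsilon=1e-3):
    """Approximate irrelevance: the perturbation stays below epsilon."""
    return relative_change(freq_hz) < epsilon
```

With these assumed values the capacitors pass the test at 50 Hz and fail it at 10 MHz, which illustrates why the claim has to be stated conditionally on the frequency range.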

Removing  an  object  from  the  model  entails  modifying  the  theory  so  that  the  existence 
of the object can no longer be deduced. In model-theoretic terms, this entails modifying
the  Herbrand  base  to  exclude  the  object.  In  terms  of  revising  the  theory,  this  amounts, 
in  the  simplest  case,  to  removing  all  references  to  the  object  in  the  theory  (akin  to  dead 
code  elimination  in  compiler  optimizations).  In  our  example  above,  we  can  simplify  the 
gain  equations  by  this  method.  In  more  complicated  cases,  we  have  to  remove  those  facts 
that assert that the object exists as well as those that depend on the fact that the object
exists.  We  continue  with  more  examples. 

3.  Missionaries  and  Cannibals 

In  the  missionaries  and  cannibals  puzzle,  the  solution  does  not  ask  for  a  particular 
individual  to  reach  the  destination  bank  ahead  of  another.  In  the  3x3  problem,  there 
are  36  possible  solutions  that  differ  on  the  order  of  the  missionaries  and  cannibals  reaching 
their  destination.  This  number  is  derived  by  using  the  fact  that  every  permutation  of  the 
missionaries amongst themselves as well as the cannibals amongst themselves transforms
a valid solution into another valid solution. The fact that the order is irrelevant allows
us to erase the identities of the individuals and clump them into sets. On analyzing
the preconditions of the action operations in this puzzle (load-boat-left, move-boat-from-left-to-right,
etc.), we see that the preconditions only require the cardinalities of the sets
of  individuals  in  each  bank  and  the  boat.  This  gives  us  the  justification  to  discard  all 
attributes  of  the  sets  except  for  their  cardinality. 
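
The count of 36 and the effect of clumping individuals into sets can be checked with a short sketch. The individual names (m1, c1, and so on) and the `abstract` function are hypothetical; the point is only that every renaming within the two groups leaves the cardinality description of a bank unchanged.

```python
from itertools import permutations
from math import factorial

MISSIONARIES = ("m1", "m2", "m3")
CANNIBALS = ("c1", "c2", "c3")


def abstract(bank):
    """Map the named individuals on a bank to (missionary count, cannibal count)."""
    return (sum(1 for p in bank if p in MISSIONARIES),
            sum(1 for p in bank if p in CANNIBALS))


def renamings():
    """Permute missionaries among themselves and cannibals among themselves."""
    for pm in permutations(MISSIONARIES):
        for pc in permutations(CANNIBALS):
            mapping = dict(zip(MISSIONARIES, pm))
            mapping.update(zip(CANNIBALS, pc))
            yield mapping


all_renamings = list(renamings())

# One concrete bank with named individuals, and its images under every renaming.
left_bank = {"m1", "c2", "c3"}
images = {frozenset(r[p] for p in left_bank) for r in all_renamings}
abstract_images = {abstract(bank) for bank in images}
```

There are 3! x 3! = 36 renamings, and all of the distinct named banks they produce collapse to the single abstract description (1, 2).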

Suppose  we  introduced  two  new  operators  Sit-Down  and  Stand-Up  that  act  on  indi¬ 
viduals.  Recall  that  the  goal  is  to  find  the  minimal  sequence  of  actions  that  achieves  the 
transfer  of  the  missionaries  and  the  cannibals.  We  can  state  the  fact  that  no  minimal 
sequence  of  actions  to  achieve  the  goal  uses  these  two  operators  by  making  the  claim  that 
these  operators  are  irrelevant  to  the  goal.  This  means,  even  if  the  operators  are  removed, 
the  same  solution  would  obtain.  The  irrelevance  claim  captures  an  important  property  of 
the problem space intensionally. This particular irrelevance claim can be discovered by a
local  analysis  of  the  preconditions  of  operators  in  time  linear  in  the  number  of  operators. 
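
One conservative way to mechanize this kind of check is to work backwards from the goal over operator preconditions and effects. The sketch below is an assumption about how such an analysis might look, not the procedure used in the thesis. The predicate names and the unload-boat-right operator are hypothetical, and the simple loop shown here is not tuned to meet the linear bound mentioned above.

```python
def relevant_operators(operators, goal_predicates):
    """Collect operators whose effects can contribute to the goal.

    operators maps a name to (preconditions, effects), both sets of predicate names.
    An operator is relevant if one of its effects is needed by the goal or by
    the preconditions of an operator already known to be relevant.
    """
    needed = set(goal_predicates)
    relevant = set()
    changed = True
    while changed:
        changed = False
        for name, (preconditions, effects) in operators.items():
            if name not in relevant and effects & needed:
                relevant.add(name)
                needed |= preconditions
                changed = True
    return relevant


# Hypothetical propositional abstraction of the puzzle's operators.
operators = {
    "load-boat-left": ({"OnLeft", "BoatLeft"}, {"InBoat", "OnLeft"}),
    "move-boat-from-left-to-right": ({"InBoat", "BoatLeft"}, {"BoatRight", "BoatLeft"}),
    "unload-boat-right": ({"InBoat", "BoatRight"}, {"OnRight", "InBoat"}),
    "Sit-Down": ({"Standing"}, {"Sitting", "Standing"}),
    "Stand-Up": ({"Sitting"}, {"Standing", "Sitting"}),
}

relevant = relevant_operators(operators, {"OnRight"})
irrelevant_ops = set(operators) - relevant
```

Sit-Down and Stand-Up only feed each other's preconditions, so neither is ever reached from the goal and both are flagged as irrelevant.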

4.  Abstrips 

The  essence  of  the  Abstrips  approach  for  controlling  search  in  planning  is  to  use  a 
means  for  distinguishing  details  from  essential  aspects  of  the  problem  space  [Sac74].  By 
planning  in  a  hierarchy  of  abstract  problem  spaces  ordered  by  the  amount  of  detail  in  them, 
and  by  introducing  detail  in  a  top-down  fashion,  a  search  space  which  is  exponential  in 
the  number  of  operators  is  reduced  to  one  that  is  polynomial  in  complexity. 

In contrast with all the irrelevance claims above, the claims in Abstrips cannot be
deduced from a description of the most detailed space. The hierarchy embodies knowledge
about what is detail and what isn't at particular points in the planning process, and this
knowledge1 is expressed in the irrelevance claims. By changing the criteria for what
constitutes detail, Abstrips can construct abstraction spaces that are computationally
useful for the task at hand.

1 which is not expressed in the detailed domain theory!

For instance, the Turn on the lamp operator is represented in STRIPS notation [Sac74] as follows.

Preconditions:
    Type(l, LAMP) ∧ ∃rx.                  (1)
    Inroom(ME, rx) ∧ Inroom(l, rx) ∧      (2)
    Plugged-In(l) ∧                       (3)
    Nextto(ME, l)                         (4)

Addlist:
    On(l)

Deletelist:
    Off(l)

The predicates in the preconditions are ordered in decreasing order of criticality. At
criticality 1, the predicates of lower criticality (Inroom, ..., Nextto) are ignored. Translating
this into state space terms, all states that differ purely on the values of the predicates
of  lower  criticality  are  treated  as  identical.  This  is  the  assertional  import  of  the  criticality 
assignment;  it  clumps  whole  subgraphs  in  the  state  space  graph  into  a  single  node. 

The  information  contained  in  the  criticality  assignments  can  be  given  a  clean  semantics 
by  re-expressing  them  as  irrelevance  claims.  This  also  allows  us  to  declaratively  specify 
the  criterion  of  irrelevance  which  can  change  from  task  to  task  and  permits  the  automatic 
derivation  of  the  required  abstraction  hierarchy.  Abstrips  uses  the  following  (procedurally 
expressed)  criterion  to  sift  detail  from  the  essentials.  If  a  predicate  (that  expresses  a 
condition  of  the  world)  is  easier  to  achieve  than  another,  it  deems  the  first  predicate  to 
be  an  irrelevant  detail  and  drops  it  from  consideration. 

To  formalize  the  above,  we  need  to  enrich  our  simple  definition  of  irrelevance  to  include 
an  ordering  criterion.  If  we  have  two  fact  schemas  fi  and  fj  that  are  irrelevant  to  the  same 
goal-schema  g,  we  can  impose  the  “Is-More-Irrelevant-Than”  ordering  on  fi  and  fj.  To 
automate the construction of the particular hierarchy that the designers of Abstrips chose,
we  first  define  the  semantics  of  this  ordering  to  be  ease  of  achievability.  We  then  construct 
the  abstraction  hierarchy  bottom  up  by  dropping  the  least  element  of  this  partial  order 
first,  and  proceed  iteratively,  until  all  the  elements  in  the  order  are  covered. 

For  example,  suppose  we  have  a  theory  T  with  two  predicates  fi  and  fj,  and  we  also 
know that in the context of achieving goal schema g, fi is more irrelevant than fj. Then
the irrelevance minimization method would construct a 2-level abstraction hierarchy whose
top level is the theory Tj which ignores fi, and whose bottom level contains T. In effect,
we  perform  the  following  inference. 

Description of detailed space + Specification of Is-More-Irrelevant-Than claims for a
given class of problems ==> The Abstraction Hierarchy.

The  irrelevance  minimizer  guarantees  that  at  each  level  the  abstract  theory  ignores 
detail  that  is  irrelevant  for  that  level  as  well  as  the  levels  above  it. 
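
The iterative construction can be sketched as follows. The sketch assumes the Is-More-Irrelevant-Than claims have already been linearized into a list running from most to least irrelevant (the text only requires a partial order), and it represents each level simply by the set of predicates that the level still takes into account.

```python
def abstraction_hierarchy(predicates_most_irrelevant_first):
    """Return levels from bottom (most detailed) to top (most abstract).

    Each successive level ignores one more predicate, starting with the
    most irrelevant one, until a single predicate remains.
    """
    remaining = list(predicates_most_irrelevant_first)
    levels = [set(remaining)]
    while len(remaining) > 1:
        remaining.pop(0)  # drop the most irrelevant remaining predicate
        levels.append(set(remaining))
    return levels


# The lamp operator's preconditions, least critical first.
lamp_levels = abstraction_hierarchy(["Nextto", "Plugged-In", "Inroom", "Type"])

# The two-predicate example: fi is more irrelevant than fj.
two_levels = abstraction_hierarchy(["fi", "fj"])
```

For the two-predicate case this yields a bottom level containing both predicates and a top level that ignores fi, matching the 2-level hierarchy described in the text.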

5.  Founding  Fathers 

Consider  the  following  kinship  example.  We  start  with  trees  of  Father  relations.  The 
defined  relation  Ancestor  is  the  transitive  closure  of  the  Father  relation.  The  goal  is  the 
SameFamily  relation;  two  people  belong  to  the  same  family  if  they  have  a  common  an¬ 
cestor.  If  a  simple  backward  chaining  system  like  Prolog  worked  on  this  formulation  of 
the  problem,  SameFamily  queries  would  take  time  proportional  to  the  height  of  the  tree.2 
Suppose  we  wish  to  reformulate  this  problem  in  order  to  be  able  to  solve  for  SameFamily 
queries  in  constant  time  with  O(n)  extra  space  where  n  is  the  number  of  people  in  the 
family  trees. 

There  are  two  irrelevance  claims  that  allow  us  to  accomplish  the  reformulation. 

1.  The  distinction  between  immediate  and  non-immediate  ancestry  (i.e.,  between  Father 
and  Ancestor)  is  irrelevant  to  the  SameFamily  query. 

2.  The  identity  of  the  common  ancestor  is  irrelevant  to  the  SameFamily  query. 

While the irrelevance claims in Abstrips collapsed states in the state space into equivalence
classes modulo the plan to be achieved at a certain level of abstraction, the claims
in  this  example  identify  redundant  paths  in  the  search  space.  If  the  Ancestor  fact  corre¬ 
sponding  to  a  given  ground  Father  fact  were  available  in  the  formulation,  there  would  be 
two  alternate  ways  of  concluding  SameFamily:  one  that  uses  Father(x,y)  ==>  Ancestor(x,y) 
and  the  ground  Father  fact,  the  other  that  terminates  on  the  ground  Ancestor  fact.  The 
irrelevance claim sanctions the construction of a weakening of the formulation by dropping
the Father relation. This is equivalent to relabelling the Father trees as the Ancestor
trees in the formulation. Note that model-theoretically, we get rid of the distinction between
Ancestor and Father. Proof-theoretically, all proofs of SameFamily are shortened
by one step. And in terms of the search space, this relabelling causes all branches that
result from expanding the first axiom in Figure 3.1 to be pruned. The irrelevance claim
has  a  computational  impact.  To  capture  this  notion  precisely,  we  refine  the  notion  of 
irrelevance  introduced  in  the  beginning  of  the  chapter  as:  f  is  computationally  irrelevant 
Footnote 2: ... if they succeed. If not, the interpreter generates an infinite subgoal sequence.

Father(a,b)
Father(a,c)
Father(b,d)
Father(b,e)
Father(c,f)
Father(c,g)

Father(x,y) => Ancestor(x,y)
Ancestor(x,z) ∧ Ancestor(z,y) => Ancestor(x,y)
Ancestor(z,x) ∧ Ancestor(z,y) => SameFamily(x,y)

Figure 3.1: The Given Formulation

to  g  in  the  context  of  T  if  the  conceptual  derivative  of  g  with  respect  to  f  in  T  is  zero  and 
the  simplification  of  T  that  is  constructed  has  better  computational  properties. 

The second irrelevance claim has a stronger computational impact. It sanctions the
inference that allows us to short-circuit all the Ancestor links up to the roots of the trees.
In Section 3.5 we will state this irrelevance claim very precisely. This claim leads to the
synthesis of the maximal Ancestor relation. The mechanics of the theory revision that
accomplishes it are the subject of Chapter 4. The abstract justification for both these
claims is: a fact schema is irrelevant if there exists an alternate fact schema that reaches
the solution without it.
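
The effect of the two claims can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reformulator described in this thesis: the dictionary `father`, the second family (h, i), and the helper names are hypothetical, and the "maximal ancestor" table is one assumed encoding of the maximal Ancestor relation. The original query walks Father links (time proportional to tree height); the reformulated query is a table lookup using O(n) extra space.

```python
# Hypothetical data: child -> father. The first tree mirrors Figure 3.1;
# the second tree (h, i) is added only so that a negative answer exists.
father = {"b": "a", "c": "a", "d": "b", "e": "b", "f": "c", "g": "c", "i": "h"}

def ancestors(person):
    """All proper ancestors of person, found by walking Father links upward."""
    found = set()
    while person in father:
        person = father[person]
        found.add(person)
    return found

def same_family_original(x, y):
    """SameFamily as in the given formulation: x and y share some ancestor z."""
    return bool(ancestors(x) & ancestors(y))

def build_max_ancestor():
    """Map every person who has a father to the root of their tree.
    Each Father link is followed at most once, so the table costs O(n)."""
    max_anc = {}
    for p in father:
        path, q = [], p
        while q in father and q not in max_anc:
            path.append(q)
            q = father[q]
        top = max_anc.get(q, q)  # q is a root, or already labelled
        for node in path:
            max_anc[node] = top
    return max_anc

def same_family_fast(x, y, max_anc):
    """Constant-time SameFamily: compare maximal ancestors only."""
    return x in max_anc and max_anc.get(x) == max_anc.get(y)

max_anc = build_max_ancestor()
print(same_family_fast("d", "g", max_anc), same_family_fast("d", "i", max_anc))
```

The identity of the intermediate ancestors never appears in the fast query, which is the content of the second irrelevance claim.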

6.  Granularity  Shifts 

The width of the road is irrelevant to the length of the path from SF to LA. This means
that two points that differ only on the width dimension along Rte 101 are in the same
equivalence class with respect to distance from LA. Model-theoretically, this allows us to
project out the width component from the Herbrand base of a theory that treats the road
as a surface. Points on the road in this theory are regarded as elements of $\Re^2$, and the
irrelevance claim sanctions the simplification to a theory whose Herbrand base contains
their projections along one coordinate, which are elements of $\Re$. Computationally, this is a
win because we have to compute the length of the line joining SF and LA as opposed to
computing the average of such lines on the surface connecting SF to LA.

Another example is the coarsening of a discretization: e.g., we may wish to reformulate
a theory which distinguishes between temperatures in the interval [58..60] to one that does
not. This requires treating the distinct objects 58, 59, 60 as elements of a single equivalence
class. The theory in which they occur is modified so that the fact that $58 \neq 59 \neq 60$ can
no longer be deduced in it.
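
As a minimal sketch of such a coarsening (the function names and the choice of 58 as the class representative are assumptions made for illustration), every temperature in the interval is mapped to a single representative, so that the coarsened theory can no longer tell 58, 59 and 60 apart:

```python
def coarsen(temp, low=58, high=60):
    """Map every temperature in [low..high] to one representative (low)."""
    return low if low <= temp <= high else temp

def distinguishable(t1, t2):
    """True if the coarsened theory can still tell t1 and t2 apart."""
    return coarsen(t1) != coarsen(t2)

print(distinguishable(58, 60), distinguishable(57, 58))
```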

7.  Abstraction  in  Digital  Circuits 

We present an example of structural abstraction. The idea behind this is that the
specification of the device should not reflect its internal structure, but only its externally
observable behaviour. Suppose we have the gate-level description of a full-adder circuit
as in Figure 2.1. Suppose further that we deem the exact values of the signal at the
points d, e and f irrelevant. The current theory of the full adder allows us to deduce the
output values at the above-mentioned points in the circuit. However, all we care about is
that there exists a value. We therefore need to weaken the theory to make these values
undeducible. One minimal weakening that accomplishes this without changing the values
of sum and carry changes the Herbrand base. It introduces a new object called the
Full-Adder FA, as in Figure 2.2, which has 3 inputs, a, b and c, and outputs sum and carry.
We compress the reasoning chains through d, e, and f in the detailed theory and express
the values of sum and carry entirely in terms of a, b, and c. By changing the assignment
of points in the circuit that we wish to make irrelevant, we can segment the circuit in
goal-sensitive ways to get good abstractions.
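
The following Python sketch illustrates the idea. The gate wiring is an assumption (a standard two-half-adder layout in which d = a XOR b, e = a AND b, f = d AND c), since Figure 2.1 is not reproduced here; the function names are hypothetical. The abstract description exposes only sum and carry, yet agrees with the detailed one on every input.

```python
from itertools import product

def full_adder_gates(a, b, c):
    """Detailed theory: intermediate points d, e, f are deducible.
    The wiring is an assumed standard layout, not taken from Figure 2.1."""
    d = a ^ b          # first XOR gate
    e = a & b          # first AND gate
    f = d & c          # second AND gate
    return {"d": d, "e": e, "f": f, "sum": d ^ c, "carry": e | f}

def full_adder_abstract(a, b, c):
    """Abstract theory: sum and carry stated directly in terms of a, b, c."""
    return {"sum": a ^ b ^ c, "carry": (a & b) | ((a ^ b) & c)}

# The weakening preserves the observable behaviour on all eight inputs.
agree = all(
    full_adder_gates(a, b, c)["sum"] == full_adder_abstract(a, b, c)["sum"]
    and full_adder_gates(a, b, c)["carry"] == full_adder_abstract(a, b, c)["carry"]
    for a, b, c in product([0, 1], repeat=3)
)
print(agree)
```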

8.  Macrops 

The circuit example above as well as the kinship example are instances of the formation of
macro-operators in the search space by the elimination of irrelevant intermediate variables
(circuit points in the former and ancestors in the latter). Ignoring intermediate
states leads to the formation of macro-operators, and the criterion for ignoring them can
be naturally expressed as irrelevance claims. Macrops are a very fertile area of research
in machine learning [REFJ81,Kor83]. The difference between the macrops constructed in
this thesis and the standard ones in the literature is that not only do we reconfigure the search
space by eliminating intermediate steps, we also propagate the change into the formulation
so that the new formulation does not generate those irrelevant steps.

9.  Selective  Forgetting 

Resource-limited agents have limited memory and have to decide which information to
discard and which to keep. Policies for selective retention of information or forgetting
can be expressed as irrelevance claims. For instance, an agent that lives in a world in
which information obtained at time t becomes irrelevant with respect to its goals at t + 2
could be told this as an irrelevance claim. This would sanction an inference on the part of
the agent to ignore information more than 2 clock ticks old. Throwing away information
can be computationally beneficial in other contexts. In macro-operator formation in a
machine learning system, the accretion of macros slows the system down [FN71,Min88].
The rationale for discarding macros can be phrased in terms of the utility-theoretic version
of irrelevance, called computational irrelevance.

10.  Control  Reformulations 

Here the domain of discourse is over computational objects like proof trees and
computation sequences, and the irrelevance claims state the fact that there is wasted computation
going on. Consider the following formulation of the Fibonacci function:


\[
Fib(n) =
\begin{cases}
1 & \text{if } n = 1; \\
1 & \text{if } n = 2; \\
Fib(n - 1) + Fib(n - 2) & \text{if } n > 2.
\end{cases}
\]


The  sequence  of  subgoals  generated  by  a  backward-chaining  system  like  Prolog  is: 
Fib(5)
Fib(4) Fib(3)
Fib(3) Fib(2) Fib(3)
Fib(2) Fib(1) Fib(2) Fib(3)
Fib(2) Fib(1) Fib(2) Fib(2) Fib(1)

If Fib(n) has already been computed before and has been stored away, it is irrelevant to
compute it again using the definition. This is the structure of the explanation for
reformulations that eliminate repeated computation. If the objective is to minimize the number
of compute actions, then compiling this irrelevance claim into the formulation requires
rewriting it so that a Fib value is looked up rather than computed whenever possible. The
new relation introduced by the irrelevance minimizer caches a part of the computation
sequence (e.g., the previous two Fib values at any point in the Fib computation).

The irrelevance claim in this case is a conditional one that says that if there is a stored
value for a Fibonacci computation, then the computation of that value in the context of
any goal is irrelevant. Notice that what is deemed irrelevant in this case is the computation
of Fib(n) from the definition. We can express the meaning of this claim in terms of the
fact based semantics introduced informally in Section 3. To do this we need to distinguish
computed Fib's from stored Fib's.
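
The distinction between computed and stored Fib's can be made concrete with a small Python sketch. It is an illustration only: the counter dictionary, the `store` argument and the function names are assumptions introduced here, and the sketch counts "compute actions" as applications of the definition.

```python
def fib_computed(n, counter):
    """Backward chaining on the definition; every subgoal is recomputed."""
    counter["computes"] += 1
    if n <= 2:
        return 1
    return fib_computed(n - 1, counter) + fib_computed(n - 2, counter)

def fib_stored(n, store, counter):
    """Look a value up when it is stored; use the definition only otherwise."""
    if n in store:
        return store[n]          # stored Fib: no compute action needed
    counter["computes"] += 1     # computed Fib
    if n <= 2:
        value = 1
    else:
        value = fib_stored(n - 1, store, counter) + fib_stored(n - 2, store, counter)
    store[n] = value
    return value

plain, cached = {"computes": 0}, {"computes": 0}
print(fib_computed(5, plain), plain["computes"])
print(fib_stored(5, {}, cached), cached["computes"])
```

For Fib(5) the first version expands nine subgoals, matching the subgoal sequence listed above, while the second applies the definition once per distinct argument.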

This  reformulation  is  called  a  control  reformulation  because  the  entities  that  are 
deemed  irrelevant  are  computations.  Also  the  effect  of  the  reformulation,  viz.,  the  mini¬ 
mization  of  repeated  computation,  can  be  achieved  by  changing  the  problem  solver  while 
keeping  the  formulation  intact.  Forward  chaining  from  Fib(l)  and  Fib(2)  on  the  current 
formulation  until  we  obtain  the  required  Fib  value  is  an  optimal  computation.  Control 
reformulations  can  be  described  by  the  following  equation. 


\[
F_1 + PS_1 =
\begin{cases}
F_2 + PS_1 & \text{An irrelevance reformulation;} \\
F_1 + PS_2 & \text{Meta-level control of inference}
\end{cases}
\]
where $F_1$ and $F_2$ stand for the current and the new formulations respectively and $PS_1$
and $PS_2$ are the given and the modified problem solver. Control reformulations move
knowledge from the interpreter to the formulation itself. They are thus
information-gaining, since the new formulation has compiled control information that was absent in
the old formulation.


3.4  Taxonomy  of  Irrelevance 

There are at least two axes on which to taxonomize irrelevance: the type of the entity
being deemed irrelevant, and the sense in which that entity is irrelevant to a goal. The
irrelevance claims in the Fibonacci case referred to wasted computation; the claims in the
kinship example referred to facts that could be dispensed with. For each type of f, we
can define the perturbation $\Delta f$ that is legal. The irrelevance claims in the PriceofTea
case and the hybrid-π case differ on the exactness of irrelevance: the exact claims
(conceptual derivative = 0) are logical irrelevance claims. Other claims are approximate. An approximate
irrelevance claim asserts that ignoring the entity deemed irrelevant is a computationally
beneficial thing to do (irrespective of whether or not it is logically correct to do so). The
types of perturbation we are willing to consider include:

1.  f  is  a  proposition  :  flip  the  value  of  f  in  the  theory 

2.  f  is  a  set  of  propositions  :  flip  the  value  of  any  f  in  that  set 

3.  f  is  an  object  :  remove  f  from  the  Herbrand  base  of  that  theory 

4.  f  is  a  set  of  objects:  remove  them  all  from  the  Herbrand  base 

5.  f  is  part  of  a  proof  tree  or  search  space  :  prune  that  part  of  the  proof  tree  or  search  space

6.  f  is  an  action:  do  not  perform  action 

7.  f  is  an  argument  of  a  relation:  drop  that  argument 

Irrelevance claims can be probabilistic too. This involves attaching probabilities to the
claims: most independence claims in the physical world as well as irrelevance claims in
ill-structured domains like medicine have this flavour. Probabilistic claims will be studied
in the future. In this thesis we limit ourselves to exact irrelevance claims. These are of
two kinds: logical and computational. A logical irrelevance claim allows the discarding of
information and guarantees that the answers to the given set of goals will be unaffected. An
approximate irrelevance claim, like the hybrid-π case, allows the discarding of information
but does not guarantee that the answers to the given set of goals will remain the same. A
special class of logical irrelevance claims are called computational irrelevance claims because
they guarantee not only that discarding some information preserves the answers to the
goals of interest, but also that throwing away that information causes the computation of the
goals to proceed faster.


3.5  Formalizing  Irrelevance 

Inefficient  formulations  make  irrelevant  distinctions.  An  irrelevance  claim  expresses  a  jus¬ 
tification  for  an  abstraction  reformulation.  It  explains  why  some  conceptual  elements  in 
a  formulation  are  expendable  and  why  some  distinctions  can  be  collapsed  to  get  more  ab¬ 
stract  ones.  We  devise  a  logic  of  irrelevance  to  state  irrelevance  claims  and  to  derive  them 
mechanically  from  the  given  formulation.  We  do  so  by  introducing  a  relation  Irrelevant 
that  takes  3  arguments:  f:  the  thing  that  is  deemed  irrelevant,  g:  the  goal  and  T:  the 
theory  or  context  in  which  irrelevance  is  established. 

3.5.1  The  need  for  a  meta-theoretical  analysis 

If we were asked whether or not $x^{999} + x^{888} + x^{777} + \ldots + x^{111}$ is divisible by
$x^9 + x^8 + x^7 + \ldots + x^1$, we have two choices:

1.  Perform  the  actual  division.  This  is  a  long  and  computationally  expensive  process 
which  results  in  the  generation  of  the  quotient. 

2. Utilize the fact that the quotient is not needed at all; all we want is to determine
whether or not the polynomial is divisible. There are sufficient conditions
for divisibility, stated as easily evaluable conditions on the forms of the divisor and
dividend, that we can use to determine this property.

The same motivation underlies the logic of irrelevance: it gives us a way to express
and use information about dependencies between sets of facts in a formulation without
carrying out the detailed proofs. The logic of irrelevance allows the succinct expression of
the fact that some distinctions can be dispensed with on logical or utilitarian grounds.
It also justifies the discarding of distinctions in the form of a proof of irrelevance. The
logic of irrelevance essentially performs a qualitative analysis of proofs of a class of goals
in a formulation: it makes possible intensional reasoning about proofs and about the
formulations in which they arise, without exhaustively enumerating them. When the domain of
discourse is search spaces, the logic of irrelevance allows us to intensionally reason about
pruning search paths.

3.6  The  propositional  logic  of  irrelevance 

Here  the  arguments  of  the  ternary  relation  Irrelevant  are  restricted  to  be  propositions  or 
sets  of  propositions.  We  say  that  f  is  irrelevant  to  g  in  the  context  of  the  set  of  sentences 
T,  written  as  Irrelevant(f,g,  T),  if  changing  the  truth  value  of  f  in  T  does  not  affect  that 
of  g.  The  truth  value  of  f  in  T  is  changed  by  constructing  a  weakening  T’  of  T  that  no 
longer  supports  the  truth  of  f.  First,  we  define  a  weakening. 

Definition 14 The set $T'$ of sentences in the vocabulary C is weaker than the set $T$ in
the same vocabulary if $T \models T'$, or equivalently if
$DeductiveClosure(T') \subseteq DeductiveClosure(T)$.

Two  classes  of  weakenings  are  explored  in  this  thesis.  Both  use  the  subset  operation.  The 
first  class,  called  Type  1  weakenings  decrease  the  deductive  closure  of  the  set  without 
changing  the  Herbrand  universe.  Type  1  weakenings  are  characterized  by  Definition  1. 
Type  2  weakenings  collapse  propositions  by  constructing  equivalence  classes  of  them  and 
thus  change  the  Herbrand  base.  We  study  Type  1  weakenings  only.  The  irrelevance  claims 
described  are  called  weak  irrelevance  claims. 


Definition 15 $WI_1(f, g, T) \equiv T \models g$ and $\exists T'.\ T' \models g$ and $T' \not\models f$.

Example 1 Let $T_1 = \{l, t\}$. Let $T_2 = \{l,\ l \Rightarrow t\}$, where l and t stand for the propositions
that there is lightning and thunder respectively.

Using Definition 15, we can conclude that $WI_1(l, t, T_1)$ since we can construct a $T'$
with the required properties. However, we cannot show that $WI_1(l, t, T_2)$. This is because
the truth of t in $T_2$ does depend on that of l: the conceptual derivative is not 0 in $T_2$.

This example shows a very interesting property of $WI_1$. Even though $T_1$ and $T_2$ are
identical from a model-theoretic viewpoint, the irrelevance judgements in the two sets are
not. Claims about $WI_1$ are sensitive to the form of T. This precludes describing the
semantics of $WI_1$ using model theory alone. The status of t in the two sets $T_1$ and $T_2$ is
different: in $T_1$ it is a primitive proposition (its justification set is empty), whereas in $T_2$ it
is a derived proposition (its justification set is the singleton containing l). This distinction
between primitive and derived t's is lost in the model-theoretic accounts of the meaning
of the two sets.

This example brings up an interesting kind of non-monotonicity that $WI_1$ possesses.
The irrelevance claims are not preserved across the deductive-closure operation on T.

Example 2 Let $T_1 = \{p,\ p \Rightarrow q\}$ and let $T_2 = \{p,\ p \Rightarrow q,\ q\}$.

It is the case that $\neg WI_1(p, q, T_1)$ and $WI_1(p, q, T_2)$, even though $T_2$ is generated from $T_1$
by adding an entailed conclusion. So irrelevance claims of this form are not preserved
across addition of deductively entailed conclusions. However, note that the conditional
irrelevance claim that holds in both sets is $q \in T \Rightarrow WI_1(p, q, T)$. This states that if q
were present in the set T, then p would be irrelevant to q. For $T_2$ the LHS is true and
thus we get $WI_1(p, q, T_2)$. For $T_1$, we interpret the condition counterfactually: if q were
added to $T_1$, then p would be $WI_1$-irrelevant to q.
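
Definition 15 can be checked mechanically on small theories. The sketch below is a brute-force illustration, not the irrelevance reasoner of this thesis. It assumes that T contains only atoms and definite Horn rules (so that entailment of an atom coincides with membership in the forward-chaining closure), that f and g are atoms, and that the witness T' ranges over subsets of T; the helper names `rule`, `closure` and `wi1` are hypothetical.

```python
from itertools import combinations

def rule(*premises, then):
    """Build the Horn rule 'premises => then' (hypothetical helper)."""
    return (frozenset(premises), then)

def closure(theory):
    """Atoms derivable from atoms and Horn rules by forward chaining."""
    known = {s for s in theory if isinstance(s, str)}
    rules = [s for s in theory if not isinstance(s, str)]
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def wi1(f, g, theory):
    """WI_1(f, g, T): T entails g, and some subset T' entails g but not f."""
    theory = list(theory)
    if g not in closure(theory):
        return False
    for size in range(len(theory) + 1):
        for subset in combinations(theory, size):
            derived = closure(subset)
            if g in derived and f not in derived:
                return True
    return False

# Example 1: lightning (l) and thunder (t).
print(wi1("l", "t", ["l", "t"]), wi1("l", "t", ["l", rule("l", then="t")]))
# Example 2: adding the entailed conclusion q changes the judgement.
p_q = rule("p", then="q")
print(wi1("p", "q", ["p", p_q]), wi1("p", "q", ["p", p_q, "q"]))
```

The enumeration is exponential in the size of T, so it is useful only for checking examples of this size.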

Definition 15 does not usually specify a unique $T'$. If there are multiple subsets of
T such that they entail g without entailing f, then we have to choose which one will be
constructed by our automated irrelevance reasoner.

Definition 16 $WI_{max}(f, g, T) \equiv T \models g$ and $\exists T'.\ T' \models g$ and $T' \not\models f$, where $T'$ is a
maximal subset of T with this property.

One might wonder why we impose the restriction of maximality on $T'$ in this
definition. This requirement is not essential for the determination of irrelevance. It is necessary,
however, that the revision of T that we construct be the most conservative one, so that
we do not lose any more information than we need to.
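
To see that the maximal subset need not be unique, the following sketch enumerates all maximal witnesses for a small theory. It rests on the same assumptions as before (atoms and definite Horn rules only, brute-force enumeration of subsets); the theory used and the names `rule`, `closure` and `maximal_witnesses` are hypothetical illustrations.

```python
from itertools import combinations

def rule(*premises, then):
    """Build the Horn rule 'premises => then' (hypothetical helper)."""
    return (frozenset(premises), then)

def closure(theory):
    """Atoms derivable from atoms and Horn rules by forward chaining."""
    known = {s for s in theory if isinstance(s, str)}
    rules = [s for s in theory if not isinstance(s, str)]
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def maximal_witnesses(f, g, theory):
    """All maximal subsets T' of T that entail g without entailing f."""
    theory = list(theory)
    if g not in closure(theory):
        return []
    witnesses = []
    for size in range(len(theory) + 1):
        for subset in combinations(theory, size):
            derived = closure(subset)
            if g in derived and f not in derived:
                witnesses.append(frozenset(subset))
    # Keep only witnesses that are not strictly contained in another one.
    return [w for w in witnesses if not any(w < other for other in witnesses)]

# g follows from p or from q; f needs both p and q.
T = ["p", "q", rule("p", then="g"), rule("q", then="g"), rule("p", "q", then="f")]
for witness in maximal_witnesses("f", "g", T):
    print(sorted(map(str, witness)))
```

Here three maximal subsets exist (drop p, drop q, or drop the rule that derives f), so a reasoner that acts on the claim has to choose among them.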



3.6.1 Some properties of $WI_1$

$WI_1$ is intransitive and asymmetric in the general case. It is also non-monotonic with
respect to additions to T.

1.  Intransitivity 

$WI_1(f, g, T) \wedge WI_1(g, h, T) \not\Rightarrow WI_1(f, h, T)$

For  example,  let  f,  g,  and  h  stand  for  the  following  propositions, 
f  :  There  is  a  blight  on  the  tea  crop  in  China, 
g  :  I  am  writing  my  thesis, 
h  :  The  price  of  tea  in  China  is  affected. 

The general schema for generating intransitive claims is to take two causally related
events and interpose an irrelevant one between them. This property of $WI_1$ makes
it difficult to propagate irrelevance conclusions directly.

2.  Asymmetry 

$WI_1(f, g, T) \not\Rightarrow WI_1(g, f, T)$

Let  f  and  g  stand  for  the  following  propositions, 
f  :  Joe  sells  his  stock 
g  :  The  stock  market  goes  down 
T  :  Joe  is  a  very  small  investor 

It  is  true  that  Joe’s  selling  his  stock  has  a  negligible  effect  on  the  market,  but  the 
crash  of  the  market  is  very  relevant  to  Joe’s  selling  actions! 

For  the  special  case  when  f  and  g  are  independent  (that  is,  they  can  be  assigned 
truth  values  in  T  independent  of  each  other)  we  can  infer  the  irrelevance  of  g  to  f 
from  the  irrelevance  of  f  to  g. 

3.  Non-monotonicity 

Suppose we have an irrelevance claim that is true of a set of sentences. We will
show that making consistent additions to that set does not necessarily preserve
the irrelevance claim. We will use Example 1 defined earlier. We showed that
$\neg WI_1(l, t, T_2)$. Now, add t to $T_2$ to construct $T_3$. Applying Definition 15 again, we
can establish that $WI_1(l, t, T_3)$. This is an interesting kind of non-monotonicity:
irrelevance claims are not preserved under deductive closure.

3.6.2 Towards a proof theory of $WI_1$

The irrelevance claim $WI_1(f, g, T)$ states that either f is not necessary for deriving g in T,
or that there are alternative ways of deriving g that do not require f. The claim allows
us to make statements about proofs of g in T without enumerating them. The assertional
import of the claim (i.e., the inference it sanctions) is that it is correct, with respect to
preserving g, to simplify T to $T'$ by the subset construction in Definition 15.

Example 3 Let $T = \{p \wedge q \Rightarrow r,\ p \wedge q \Rightarrow s,\ s \wedge t \wedge u \Rightarrow r,\ p \Rightarrow t,\ u,\ p,\ q\}$.

Notice that there are two ways of concluding r. The shorter proof uses p, q, and $p \wedge q \Rightarrow r$.
The longer one uses s as an intermediate conclusion. It is correct to assert in the
meta-theory of T that $WI_1(s, r, T)$. If we act upon this claim, we would simplify T to the subset
$\{p \wedge q \Rightarrow r,\ s \wedge t \wedge u \Rightarrow r,\ p \Rightarrow t,\ u,\ p,\ q\}$ and allow only the shorter proof of r to be
concluded. The irrelevance claim is a control hint and the inference it sanctions shuts out
undesired proofs of the goal. Suppose now that we want the longer proof of r to succeed.
In this case, we would make the rule $p \wedge q \Rightarrow r$ irrelevant to r. The subset construction
in Definition 15 does not achieve the omission of the rule, which would ensure that the
only proof of r remaining would be the one through s. To do this, we need the $WI_2$ notion
here.

Definition 17 $WI_2(f, g, T) \equiv T \models g$ and $\exists T'.\ T' \models g$ where $T' = T - \{f\}$.

Example 4 Let $T = \{p,\ p \Rightarrow q,\ q\}$.

Is p irrelevant to q? By the subset definition, we can construct a subset $T'$ of T, namely
$\{p \Rightarrow q,\ q\}$, that allows us to conclude q without committing ourselves on p. So p is indeed
irrelevant to q. Let us consider the subsets of T and write down the irrelevance claims
that would generate them.

$\{p,\ p \Rightarrow q\}$ : $WI_2(q, q, T)$

$\{p \Rightarrow q,\ q\}$ : $WI_1(p, q, T)$

$\{p,\ q\}$ : $WI_2(p \Rightarrow q, q, T)$

This points to a limitation of $WI_1$: it cannot distinguish between stored and derived
propositions. To generate the first subset above, we needed to remove q that was stored
and allow the derived q to survive. $WI_1$ completely purges T of the proposition that is
deemed irrelevant. So we need the weaker version $WI_2$, which simply removes the irrelevant
proposition without removing those that derive it and those whose derivation depends on
it. Intuitively, what $WI_2$ accomplishes is getting rid of f if it occurs explicitly in T, i.e.,
it removes 0-length proofs of f in T. This can be stated precisely as follows.

Theorem 5 If f is a primitive fact in T (i.e., $Support(f, T) = \emptyset$), then $WI_1(f, g, T) \equiv
WI_2(f, g, T)$ and the $T'$ that result from both constructions are the same.

When f does not occur explicitly in T, then $WI_2(f, g, T)$ will be true vacuously. All
that $WI_2$ can do is deem cached f's irrelevant to the computation of g. $WI_1$, on the other
hand, purges f completely from the formulation. So it removes cached f's as well as blocks
all possible derivations of f in T.
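
The contrast between the two notions can be checked on Example 4 with a short sketch. As before, this is an illustration under assumptions: T holds only atoms and definite Horn rules, $WI_1$ is restricted to atomic f, witnesses are subsets of T found by brute force, and the names `rule`, `closure`, `wi1` and `wi2` are hypothetical.

```python
from itertools import combinations

def rule(*premises, then):
    """Build the Horn rule 'premises => then' (hypothetical helper)."""
    return (frozenset(premises), then)

def closure(theory):
    """Atoms derivable from atoms and Horn rules by forward chaining."""
    known = {s for s in theory if isinstance(s, str)}
    rules = [s for s in theory if not isinstance(s, str)]
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def wi1(f, g, theory):
    """WI_1: some subset of T entails g while no longer entailing atom f."""
    theory = list(theory)
    if g not in closure(theory):
        return False
    return any(
        g in closure(sub) and f not in closure(sub)
        for size in range(len(theory) + 1)
        for sub in combinations(theory, size)
    )

def wi2(f, g, theory):
    """WI_2: T entails g, and so does T with the stored sentence f removed."""
    theory = list(theory)
    reduced = [s for s in theory if s != f]
    return g in closure(theory) and g in closure(reduced)

p_q = rule("p", then="q")
T = ["p", p_q, "q"]                        # Example 4
print(wi1("q", "q", T), wi2("q", "q", T))  # stored q versus derived q
print(wi1("p", "q", T), wi2("p", "q", T))  # p is primitive: both agree
print(wi2(p_q, "q", T))                    # dropping the rule keeps stored q
```

Only `wi2` can discard the cached copy of q while leaving its derivation intact, which is the limitation of $WI_1$ noted above.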

3.6.3  An  Axiomatization  of  WI2 

The domain of discourse of the logic includes propositional atoms, well-formed formulae
(made from connectives), sets of wffs and proofs. The constants are taken from the set
P of propositional atoms $p_1, p_2, \ldots, p_n$. The connectives are $\wedge$, $\vee$, $\Rightarrow$, $\equiv$. The
well-formed formulae are defined inductively. There are also sets of well-formed formulae
$S_1, S_2, \ldots$, and sequences of well-formed formulae (also called proofs) $P_1, P_2, \ldots$.

We  now  define  some  of  the  functions. 

Consequences : wff x Set of wffs → Set of wffs
Antecedents : wff x Set of wffs → Set of wffs
Intermediates : wff x Set of wffs → Set of wffs
Paths : wff x Set of wffs → Set of Sequences of wffs

Informally,  Consequences  finds  the  consequences  of  a  wff  in  a  set  of  wffs,  both  immediate 
and  derived.  Antecedents  finds  the  set  of  support  for  a  wff  in  a  set  of  wffs.  Intermediates 
finds  those  wffs  in  a  set  of  wffs  that  use  a  given  wff  as  an  intermediate  in  their  derivation. 
Paths  of  a  wff  is  a  set  of  proofs  for  that  wff. 

These  functions  are  defined  inductively. 

$x \in Consequences(x, T)$.

If $y \in Consequences(x, T) \wedge y \Rightarrow z \in T$ then $z \in Consequences(x, T)$.

$x \in Antecedents(x, T)$.

If $z \in Antecedents(x, T) \wedge y \Rightarrow z \in T$ then $y \in Antecedents(x, T)$.

$Intermediates(h, S) = \{f, g \mid f \Rightarrow^* h \in S \wedge h \Rightarrow^* g \in S\}$

$\Rightarrow^*$ stands for transitive closure under $\Rightarrow$.
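
The inductive definitions of Consequences and Antecedents amount to forward and backward reachability in the graph of implications. A minimal sketch, assuming T is given as a set of pairs (y, z) standing for single-premise implications y ⇒ z (the representation and function names are assumptions made for illustration):

```python
def consequences(x, implications):
    """Least set containing x and closed under: y in set, (y => z) in T gives z."""
    result, frontier = {x}, [x]
    while frontier:
        y = frontier.pop()
        for premise, z in implications:
            if premise == y and z not in result:
                result.add(z)
                frontier.append(z)
    return result

def antecedents(x, implications):
    """Least set containing x and closed under: z in set, (y => z) in T gives y."""
    result, frontier = {x}, [x]
    while frontier:
        z = frontier.pop()
        for y, conclusion in implications:
            if conclusion == z and y not in result:
                result.add(y)
                frontier.append(y)
    return result

# Hypothetical theory: p => q, q => r, s => r.
T = {("p", "q"), ("q", "r"), ("s", "r")}
print(sorted(consequences("p", T)))
print(sorted(antecedents("r", T)))
```

With this reading, asking whether f lies outside Antecedents(g, T) is the question of whether any directed path leads from f to g.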

Theorem 6 $WI_1(f, g, T)$ is true exactly when $f \notin Antecedents(g, T)$ or f is not in every
element of $Paths(g, T)$.

The  first  condition  corresponds  to  there  being  no  directed  path  from  f  to  g  in  the  graph 
representation  of  T.  The  second  condition  is  equivalent  to  saying  that  there  is  an  alternate 
path  to  g  that  does  not  go  via  f. 

3.6.4 Lemmas of $WI_1$

1. $WI_1(f, g, T) \wedge WI_1(h, g, T) \Rightarrow WI_1(f \wedge h, g, T)$.

Take an example: Let $T = \{f,\ h,\ f \Rightarrow g,\ h \Rightarrow g\}$. $WI_1(f, g, T)$ because we can
construct $T_1 = \{h,\ f \Rightarrow g,\ h \Rightarrow g\}$. Also, $WI_1(h, g, T)$ because we can construct
$T_2 = \{f,\ f \Rightarrow g,\ h \Rightarrow g\}$. Now we ask whether $WI_1(f \wedge h, g, T)$. Yes, because either
$T_1$ or $T_2$ satisfies the requirement that neither $f \wedge h$ nor its negation is entailed in
it.

2. $WI_1(f \wedge h, g, T) \Rightarrow WI_1(f, g, T) \vee WI_1(h, g, T)$

To see why this is true, consider the following example. Let $T = \{f,\ h,\ f \Rightarrow g,\ h \Rightarrow g\}$.
There are two possible maximal subsets of T that can be constructed to show
that $WI_1(f \wedge h, g, T)$. One of them is $T_1$ from the previous example, and the other
is $T_2$. If we only knew $WI_1(f \wedge h, g, T)$, we couldn't tell whether f was irrelevant
or h was irrelevant. So we can only conclude the disjunction of the individual $WI_1$
claims. This implication seems to be an iff one, because if we can prove g without
f, then we can prove g without f conjoined with h, which is stronger than f itself!

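3. $WI_1(f \vee h, g, T) \Rightarrow WI_1(f, g, T) \vee WI_1(h, g, T)$

This can be proved in the following manner:

\[
\begin{aligned}
WI_1(f \vee h, g, T) &\equiv WI_1(\neg(f \vee h), g, T) \\
&\equiv WI_1(\neg f \wedge \neg h, g, T) \\
&\Rightarrow WI_1(\neg f, g, T) \vee WI_1(\neg h, g, T) \\
&\Rightarrow WI_1(f, g, T) \vee WI_1(h, g, T)
\end{aligned}
\]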


4. $WI_1(f, g_1 \wedge g_2, T) \Rightarrow WI_1(f, g_1, T) \wedge WI_1(f, g_2, T)$

5. [...] f, T)

Straightforward  consequence  of  the  definition  of  WI\. 

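6. $f_1 \equiv f_2 \Rightarrow [WI_1(f_1, g, T) \Rightarrow WI_1(f_2, g, T)]$

Follows directly from the definition of $WI_1$.

7. $g_1 \equiv g_2 \Rightarrow [WI_1(f, g_1, T) \Rightarrow WI_1(f, g_2, T)]$

Follows directly from the definition of $WI_1$.

8. $g \in T \wedge f \neq g \Rightarrow WI_1(f, g, T)$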

If  g  is  in  T,  then  everything  except  itself  is  redundant  to  it. 

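9. $S = \{p \in T \mid f \Rightarrow p\} \wedge WI_1(\bigwedge_{p \in S} p, g, T) \Rightarrow WI_1(f, g, T)$ when $f \neq g$ in T

If all facts derivable from f are redundant to g, then f itself is redundant to g.

10. $S = \{p \in T \mid p \Rightarrow g\} \wedge WI_1(f, \bigwedge_{p \in S} p, T) \Rightarrow WI_1(f, g, T)$ when $g \neq f$ in T

Dual of previous lemma.

11. $p \in T \wedge \mathit{Derives\text{-}Only}(f, p, T) \Rightarrow WI_1(f, g, T)$ when $T \models g$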

If  the  only  role  of  f  is  to  derive  p  which  already  exists  in  T,  then  f  is  redundant  for 
any  goal  other  than  itself. 

Since detection of redundancy is undecidable, it is clear that we can never have a
complete axiomatization of $WI_1$.

3.6.5 Properties of Proofs in the $WI_1$ calculus

Observation 4 One benefit of proving redundancy of some facts in a theory is that we
can optimize space requirements by simply removing the redundant facts.

A proof of redundancy using the lemmas of $WI_1$ in the meta-theory of T captures an
important property of the proofs of g in T, without exhaustively enumerating them. A
problem solver that can prove this redundancy claim at the meta-level can prune a large
class of inferences this way. Also, if the redundancy statements were made available to
the problem solver, it could compile them into the base-level formulation by simply throwing
away those facts that cause the redundancy. This leads to savings in space. To ensure
that savings in time in proving g result from this pruning, we need to show that the proofs
of g that fail as a result are derivationally more complex than the ones that are retained.
This is captured in our definition of computational irrelevance below. In either case, the
deliberation time for the problem solver will be reduced because the search space of proofs
of g in T is reduced by the removal of redundancy.

Definition 18 $CI_{max}(f, g, T) \equiv T \models g$ and $\exists T'.\ T' \models g$ and $T' \not\models f$, where $T'$ is a
maximal subset of T with this property and $Cost(T' \vdash g) < Cost(T \vdash g)$.

Observation 5 $CI_{max}(f, g, T) \Rightarrow WI_{max}(f, g, T)$.

The  computational  irrelevance  notion  folds  in  a  utility  measure;  it  orders  logically 
irrelevant  statements  according  to  their  computational  impact.  Contrast  this  with  an  ap¬ 
proximate  irrelevance  claim  that  isolates  logically  relevant  but  computationally  prohibitive 
distinctions  made  in  a  formulation. 

3.6.6 Expressive power and limitations of $WI_1$

The $WI_1$ statements are not reducible, in the sense that not every sentence in the extended
language is reducible to one in the pure propositional calculus. So $WI_1$ is more than a
notational convenience; it is a genuine meta-theoretic notion. $WI_1$ formalizes reachability
properties in a propositional dependency network of Horn clauses.

The  subset  construction  is  a  very  strong  bias  in  the  space  of  possible  abstractions.  The 
following  is  the  simplest  example  of  irrelevance  that  the  subset  definition  does  not  cover. 

Example 5  Let $T = \{\neg f \lor g,\ f \lor g\}$.

f is irrelevant to g in T, because there are models of T that have f and g true, and f false
and g true! Thus flipping the value of f in T does not cause that of g to change.
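The claim in Example 5 can be checked by enumerating truth assignments. The following Python sketch is illustrative only: the clause representation (sets of (atom, polarity) pairs) and the helper names `satisfies`, `models`, and `entails` are assumptions made for this example and are not part of the formalism.

```python
from itertools import product

# A clause is a set of literals; a literal is (atom, polarity).
T = [
    {("f", False), ("g", True)},  # (not f) or g
    {("f", True), ("g", True)},   # f or g
]
ATOMS = ["f", "g"]


def satisfies(model, clauses):
    """True if the assignment makes every clause true."""
    return all(any(model[a] == pol for a, pol in c) for c in clauses)


def models(clauses):
    """Enumerate all assignments over ATOMS that satisfy the clauses."""
    for values in product([True, False], repeat=len(ATOMS)):
        model = dict(zip(ATOMS, values))
        if satisfies(model, clauses):
            yield model


def entails(clauses, atom):
    """Semantic entailment of a single atom, by model enumeration."""
    return all(m[atom] for m in models(clauses))


all_models = list(models(T))
# g is true in every model of T, while f takes both truth values.
g_always_true = all(m["g"] for m in all_models)
f_values = {m["f"] for m in all_models}
# No proper subset of T entails g, so the subset construction finds nothing.
proper_subsets = [[], [T[0]], [T[1]]]
subset_entails_g = [entails(s, "g") for s in proper_subsets]
```

Under these assumptions, g holds in both models of T while f varies, yet no proper subset of T entails g, which is why the subset definition misses this case.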


Theorem  7  WIX  provides  an  axiomatic  account  of  a  TMS  style  dependency  analysis. 

1.  Reachability  analysis  on  inference  graph 

The propositional logic of irrelevance WIX does a pure reachability analysis on the
set of sentences in T. There are three cases of this reachability analysis.

(a)  iff $T - p \models p$. p occurs explicitly in T and p is also derivable by
other means in T. This happens when p is cached, and the subset construction
allows us to get rid of cached information. This is a construction that reduces
space at the cost of time.

(b) $WI_1(p, q, T)$ iff $T \models q$ and $\exists T' \subset T$ such that $T' \models q \land T' \not\models p$. This happens
when there are two or more proofs of q in T, one that goes through p, and one
that doesn't. The subset construction prunes out that part of the theory that
allows the proof through p to succeed.

(c) Independent(p, q, T): the case where no proof of q uses p. Space savings only.

These three cases take care of optimizations that save space and time without
changing the ontology.

2.  Collapsing  nodes  in  inference  graph 

The  previous  methods  only  short-circuit  chains  of  reasoning  within  an  inference 
graph  while  preserving  its  structure.  This  operation  allows  us  to  treat  several  nodes 
as  a  unit,  effectively  generating  new  objects  that  stand  for  equivalence  classes  of 
nodes. 

3.6.7  Algorithms  for  computing  WIX 

Here we present two basic methods for detecting whether f is irrelevant to g in the context
of a theory T. We assume that f is an atomic fact: a compound f can be broken into its
constituents by the rules of the irrelevance calculus, which reduces all testing of
irrelevance claims to proving atomic f's irrelevant.

•  Break  all  ways  of  concluding  f  in  T.  Check  if  g  is  still  deducible.  One  way  of  achieving 
this  in  a  Horn  clause  database  is  to  drop  all  clauses  that  have  f  occurring  positively. 
If  g  is  still  deducible  in  the  resulting  database,  we  conclude  that  f  is  irrelevant  to  g. 

•  Compute  the  support  set  of  g  in  T  (i.e.  find  all  ways  of  proving  g  in  T).  This  is  a 
set  of  justifications  for  g  in  T.  Drop  all  justifications  that  contain  f.  If  the  support 
set  empties  after  this,  then  f  is  not  irrelevant  to  g.  Otherwise  it  is. 

The proof-theoretic correlate of these two procedures is immediately obvious. Method 1
foils all proofs of f at the last step (i.e., all premises of f may be provable, but the deduction
of f is not possible because the rule that concludes f from its premises is removed). So if
a proof of g succeeds in the new set of clauses, f does not occur essentially in a proof of
g. Method 2 computes all proofs of g and collects the disjunctive set of justifications. If f
does not occur in all of them, it again implies that f was not essential to proving g in T.
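As a rough illustration of Method 2, the sketch below enumerates the support sets of g in a small definite Horn theory and checks whether any of them avoids f. The representation of clauses as (body, head) pairs, the function names, and the example theory are all hypothetical; the enumeration is exponential in the worst case and is meant only to make the procedure concrete.

```python
def supports(goal, clauses, visiting=frozenset()):
    """Yield, for each proof of `goal`, the set of atoms that proof uses.

    Clauses are definite Horn clauses written as (body, head), where body
    is a tuple of atoms; a fact has an empty body.
    """
    if goal in visiting:  # do not loop on cyclic rules
        return
    for body, head in clauses:
        if head != goal:
            continue
        partial = [frozenset([goal])]
        for atom in body:
            partial = [
                used | sub
                for used in partial
                for sub in supports(atom, clauses, visiting | {goal})
            ]
        yield from partial


def irrelevant_by_support(f, g, clauses):
    """Method 2: f is irrelevant to g if some justification of g avoids f."""
    return any(f not in used for used in supports(g, clauses))


# Hypothetical theory: p, p => f, f => g, p => g.
THEORY = [((), "p"), (("p",), "f"), (("f",), "g"), (("p",), "g")]
f_irrelevant = irrelevant_by_support("f", "g", THEORY)
p_irrelevant = irrelevant_by_support("p", "g", THEORY)
```

In this toy theory g has two justifications, one through f and one directly from p, so dropping the justifications that mention f leaves the support set non-empty, whereas every justification mentions p.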

The  first  method  above  can  be  expressed  as  the  following  algorithm. 

Algorithm 1
1. Convert T to clausal form.
2. Remove all clauses that have a positive occurrence of f in them. Call the remaining set T'.
3. Check if T' entails g.

Since every way of concluding f is foiled in this construction, the resulting set T' does not
entail f.
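A minimal sketch of Algorithm 1 for definite Horn clauses follows. It assumes the theory is already in clausal form, written as (body, head) pairs in which the head is the only positive literal; the names `forward_chain` and `algorithm1` and the example theory are assumptions for illustration, and forward chaining stands in for whatever entailment check the problem solver actually uses.

```python
def forward_chain(clauses):
    """Return every atom derivable from definite clauses (body, head)."""
    known = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in known and all(a in known for a in body):
                known.add(head)
                changed = True
    return known


def algorithm1(f, g, clauses):
    """Drop every clause with a positive occurrence of f, then test g."""
    t_prime = [(body, head) for body, head in clauses if head != f]
    return g in forward_chain(t_prime), t_prime


# Hypothetical theory: p, p => f, f => g, p => g.
THEORY = [((), "p"), (("p",), "f"), (("f",), "g"), (("p",), "g")]
f_irrelevant, t_prime = algorithm1("f", "g", THEORY)
p_irrelevant, _ = algorithm1("p", "g", THEORY)
```

Here the clause p => f is removed, g is still derivable through p => g, and f is no longer derivable from the remaining clauses.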

Example 6  $T = \{a,\ a \land b \Rightarrow g\}$

Is a irrelevant to g in this theory? No, because after we drop every positive occurrence
(1 in this case) of a, g is no longer provable. In a sense, what this procedure tests is
whether we would still be able to conclude g if we prevented the concluding of a in T.
Notice that this works well in Horn databases. But in non-Horn databases, this method
fails.

Example 7  $T = \{f \lor g,\ \neg f \lor g\}$. We can resolve these two clauses and get g. The truth
value assigned to f does not affect g; g remains true no matter what value we assign to f.

Observation 6  Algorithm 1 is sound with respect to the $WI_1$ definition.

If the algorithm deems that f is irrelevant to g, then so would the definition of $WI_1$.

Observation 7  Algorithm 1 is not complete with respect to the definition of $WI_1$.

This  algorithm  generates  one  subset  of  T  that  does  not  entail  f  and  checks  if  g  is  still 
entailed  by  that  subset.  There  are  many  subsets  of  T  that  do  not  entail  f,  and  if  g  is 
entailed  by  one  that  is  not  computed  by  Algorithm  1,  then  it  will  not  be  detected  by  it. 

Observation 8  Algorithm 1 does not compute a maximal T' with the property required by
the definition of $WI_1$.

The clause $\neg h \lor f$ will be removed even if h is not derivable in T.
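Observation 8 can be made concrete by comparing the T' that Algorithm 1 produces with the largest proper subset found by brute force. The sketch below is an assumption-laden illustration: the clause encoding, the helper names, and the four-clause theory are hypothetical, and the subset enumeration is exponential.

```python
from itertools import combinations


def forward_chain(clauses):
    """Return every atom derivable from definite clauses (body, head)."""
    known = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in known and all(a in known for a in body):
                known.add(head)
                changed = True
    return known


def largest_qualifying_subsets(f, g, clauses):
    """Largest proper subsets of the theory that derive g but not f."""
    for size in range(len(clauses) - 1, -1, -1):
        found = []
        for subset in combinations(clauses, size):
            derived = forward_chain(subset)
            if g in derived and f not in derived:
                found.append(list(subset))
        if found:
            return found
    return []


# Hypothetical theory: p, p => f, p => g, h => f (h is not derivable).
THEORY = [((), "p"), (("p",), "f"), (("p",), "g"), (("h",), "f")]
algorithm1_t_prime = [c for c in THEORY if c[1] != "f"]
largest = largest_qualifying_subsets("f", "g", THEORY)
```

In this example Algorithm 1 discards both clauses that conclude f, while the brute-force search keeps h => f, since that clause can never fire.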

3.7  The first order case

For  propositional  logic,  meta-level  proofs  of  derivability  offer  no  advantage  over  the  base 
level  proofs.  Doing  a  step  in  the  meta-level  corresponds  to  doing  an  inference  step  at  the 
base-level.  For  first-order  logic,  this  is  not  the  case  and  the  real  power  of  reasoning  at  the 
meta-level  is  demonstrated.  The  propositional  definition  of  irrelevance  scales  up  to  the 
first  order  case  as  follows. 

Definition 19  $f(x, z)$ is weakly irrelevant to $g(x, y)$ modulo the set of sentences $T$, written
as $WI(f(x, z), g(x, y), T)$, if we can construct the set $T'$ such that $T' \subset T$ and
$[\forall x\,y.\ T \models g(x, y) \equiv T' \models g(x, y)$ where $[\forall z.\ T' \not\models f(x, z)]]$ and $T'$ is a maximal subset of
$T$ with this property.

We  show  two  examples  of  first-order  irrelevance  claims  from  the  kinship  example.  As 
far  as  SameFamily  is  concerned,  the  distinction  between  immediate  and  non-immediate 
ancestry  is  irrelevant. 


IC1:

$\forall x\,y\,m\,n.\ Ancestor(x, y) \in T \Rightarrow WI(Father(x, y), SameFamily(m, n), T)$


Further,  the  identity  of  the  common  ancestor  is  also  irrelevant;  all  that  SameFamily  seeks 
to  establish  is  the  existence  of  a  common  ancestor.  This  can  be  stated  as  follows: 


IC2:

$\forall x\,y\,z\,m\,n.\ Ancestor(x, y) \in T \land Ancestor(y, z) \in T \land Ancestor(x, z) \in T \Rightarrow$
$WI(Ancestor(y, z), SameFamily(m, n), T)$


These  irrelevance  claims  can  be  verified  using  the  calculus  of  irrelevance  presented 
below.  We  have  built  a  meta-level  irrelevance  reasoner  that  can  prove  these  claims  given 
the  initial  encoding. 

Verification of an irrelevance claim using the semantic definition of irrelevance takes
time exponential in the size of T. To make the verification process tractable we generate
lemmas that satisfy the definition. These lemmas constitute the proof theory for WI. All
variables below are assumed to be universally quantified; x and y are vectors of variables
that are not disjoint.


1. $WI(f_1(x) \land f_2(x), g(y), T) \Rightarrow WI(f_1(x), g(y), T) \lor WI(f_2(x), g(y), T)$

2. $WI(f(x), g_1(y) \land g_2(y), T) \Rightarrow WI(f(x), g_1(y), T) \land WI(f(x), g_2(y), T)$

3. $\neg WI(f(x), f(x), T)$

4. $f_1(x) \equiv f_2(x) \Rightarrow [WI(f_1(x), g(y), T) \Rightarrow WI(f_2(x), g(y), T)]$

5. $g_1(y) \equiv g_2(y) \Rightarrow [WI(f(x), g_1(y), T) \Rightarrow WI(f(x), g_2(y), T)]$

6. $g(y) \in T \land f(x) \not\equiv g(y) \Rightarrow WI(f(x), g(y), T)$

7. $S = \{p(x) \in T \mid (f(x) \Rightarrow p(x)) \in T\} \land WI(\bigwedge_{p(x) \in S} p(x), g(y), T) \Rightarrow WI(f(x), g(y), T)$

8. $S = \{p(y) \in T \mid (p(y) \Rightarrow g(y)) \in T\} \land WI(f(x), \bigwedge_{p(y) \in S} p(y), T) \Rightarrow WI(f(x), g(y), T)$

9. $q(x) \in T \land \text{Derives-Only}(f(x), q(x), T) \Rightarrow WI(f(x), g(x), T)$ where $f(x) \not\equiv g(x)$

The lemmas 1-8 are straightforward extensions of the propositional lemmas. Derives-Only
is a predicate that holds between fact schemas. Derives-Only(f(x), q(x), T) is true of
an encoding T if the only rule that has f(x) on its left-hand side concludes q(x). Since rules
can be looked up in constant time by our problem solver, this check can be accomplished
very quickly.
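One way to picture the Derives-Only check is as a lookup in an index from antecedent predicates to the heads of the rules that mention them. The sketch below is hypothetical: it works at the level of predicate symbols only, and the three kinship rules it lists are an assumed reading of the kinship example (Father implies Ancestor, Ancestor is transitive, and SameFamily is defined through a common ancestor), not a transcription of the actual encoding.

```python
from collections import defaultdict

# Assumed predicate-level view of the rules: (body predicates, head predicate).
RULES = [
    (("Father",), "Ancestor"),
    (("Ancestor", "Ancestor"), "Ancestor"),
    (("Ancestor", "Ancestor"), "SameFamily"),
]


def index_by_antecedent(rules):
    """Map each predicate to the set of heads of rules whose body uses it."""
    index = defaultdict(set)
    for body, head in rules:
        for pred in body:
            index[pred].add(head)
    return index


def derives_only(f, q, index):
    """True if every rule with f on its left-hand side concludes q."""
    return index.get(f, set()) == {q}


INDEX = index_by_antecedent(RULES)
father_only_ancestor = derives_only("Father", "Ancestor", INDEX)
ancestor_only_same_family = derives_only("Ancestor", "SameFamily", INDEX)
```

With the index built once, each check is a single dictionary lookup, which is the sense in which the test is cheap.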

Theorem 8  Lemmas 1 through 9 are sound with respect to the semantics presented in
Definition 4.

Theorem 9  Lemmas 1 through 9 are incomplete with respect to the semantics presented
in Definition 4.

Proof: To complete the axiomatization of WI would require that we identify all the
base cases for proving irrelevance (redundancy). Since detection of redundancy is
semi-decidable, this task is impossible. □

3.7.1  Proving  Irrelevance  Claims 

Suppose we wish to prove IC1. We look for a lemma that matches the condition side of
IC1. One lemma that does that is Lemma 9.


1. $q(l) \in T \land \text{Derives-Only}(f(l), q(l), T) \Rightarrow WI(f(l), g(s), T)$ where $f(l) \not\equiv g(s)$

The unification process yields the following binding list {f = Father, q = Ancestor, g =
SameFamily, l = (x, y), s = (m, n)}. After instantiation, we obtain


CHAPTER  3.  THE  THEORY  OF  IRRELEVANCE 


75 


2. $Ancestor(x, y) \in T \land \text{Derives-Only}(Father(x, y), Ancestor(x, y), T) \Rightarrow$
$WI(Father(x, y), SameFamily(m, n), T)$ where $Father(x, y) \not\equiv SameFamily(m, n)$.

3. Derives-Only(Father(x, y), Ancestor(x, y), T) follows from a lookup action.

4. $Father(x, y) \not\equiv SameFamily(m, n)$ follows from deduction over the definability
structures introduced in Section 4.2.1.

After the resolution of 3 and 4 against 2, we have

5. $Ancestor(x, y) \in T \Rightarrow WI(Father(x, y), SameFamily(m, n), T)$. □

The claim above states that there are two ways of establishing any SameFamily query: those
that  use  Ancestor(x,y)  alone  and  those  that  use  Father(x,y)  as  the  terminating  point.  A 
problem  solver  that  can  prove  this  claim  at  the  meta-level  has  the  opportunity  to  choose 
between  two  classes  of  paths  through  the  search  space.  If  it  prunes  out  the  Father  relation, 
all  proofs  of  SameFamily  get  shortened  by  1  step  and  the  search  space  is  reduced  because 
of  the  removal  of  nodes  that  result  from  the  expansion  using  the  Father  rule. 

As in the propositional case, the pruning alters the base-level formulation so that this
redundancy does not arise. Removal of irrelevant or redundant paths leads to savings
both in space and time, especially if the proofs of g that fail as a result of the irrelevance
removal are the ones that are more complex. More importantly, the deliberation time
for the problem solver is reduced because the search space is made smaller by the
removal of redundant paths. This notion is made precise in the definition of computational
irrelevance.

Definition 20  $f(x, z)$ is computationally irrelevant to $g(x, y)$ given $T$, written as
$CI(f(x, z), g(x, y), T)$, if it is the case that $WI(f(x, z), g(x, y), T)$, and the $T'$ that results
from the WI construction is such that $AvgCost(T \vdash g(x, y)) > AvgCost(T' \vdash g(x, y))$.

AvgCost is computed over a given distribution of the goal queries. IC1 and IC2 are
computational irrelevance claims, because the constructions of T' dictated by them result
in shorter chains of reasoning and smaller search spaces. These CI claims can be proven
with the calculus of computational irrelevance, cost models for the problem solver and
knowledge of the initial encoding.
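To make the AvgCost comparison in Definition 20 concrete, the sketch below uses a deliberately simple, hypothetical cost model: the cost of a query is the number of clause applications made by an exhaustive backward-chaining search, and AvgCost is the expectation of that cost over a query distribution. The cost model, the propositional theories, and the distribution are all assumptions; the actual cost models depend on the problem solver.

```python
def search_cost(goal, clauses, visiting=frozenset()):
    """Count clause applications in an exhaustive backward-chaining search."""
    if goal in visiting:  # cut off cyclic expansions
        return 0
    cost = 0
    for body, head in clauses:
        if head == goal:
            cost += 1 + sum(
                search_cost(atom, clauses, visiting | {goal}) for atom in body
            )
    return cost


def avg_cost(clauses, query_distribution):
    """Expected search cost over a distribution of goal queries."""
    return sum(p * search_cost(q, clauses) for q, p in query_distribution.items())


# Hypothetical theory T: f, f => q, q, q => g (q is both stored and derivable).
T = [((), "f"), (("f",), "q"), ((), "q"), (("q",), "g")]
# T' drops f and the rule that expands it.
T_PRIME = [((), "q"), (("q",), "g")]
QUERIES = {"g": 0.75, "q": 0.25}

cost_before = avg_cost(T, QUERIES)
cost_after = avg_cost(T_PRIME, QUERIES)
```

Under this model the reduced theory is cheaper for every query in the distribution, so the corresponding WI claim would also count as a CI claim; a different cost model could of course rank the two theories differently.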

Before we present a proof theory for CI, notice that IC2 would be a valid WI claim if
we replaced the conclusion by $WI(Ancestor(x, z), SameFamily(m, n), T)$. However, this
claim does not cause a reduction in proof height. In general, we have $\forall f\,g\,T.\ CI(f, g, T) \Rightarrow$
$WI(f, g, T)$. While $WI(f, g, T)$ states that it is correct to remove f in the context of
deducing g, the CI claim has the additional import that it is worthwhile to remove f.

Proof theory of CI

The consequence of the above theorem is that we can construct the CI calculus by selecting
and modifying exactly those lemmas from the WI set that actually reduce the cost of proving
g as defined by the cost model. This process is illustrated in the context of adapting
Lemmas 9 and 6 to the CI calculus.

9. $q(x) \in T \land \text{Derives-Only}(f(x), q(x), T) \Rightarrow WI(f(x), g(x), T)$ where $f(x) \not\equiv g(x)$

The two costs to compare are: the height of the proof of g(x) if f(x) were present versus the
height of the proof of g(x) if q(x) were ground instead. Since all proofs that use f(x) have
to go through the expansion via the rule $f(x) \Rightarrow q(x)$, and since every ground f fact is
replaced by exactly one q fact, the discarding of f meets both our time and space criteria.
Lemma 9 is thus a CI lemma for our problem solver. CI lemmas tune the WI lemmas to
the idiosyncrasies of the given problem solver.

As  another  example,  consider  Lemma  6. 

6. $g(y) \in T \land f(x) \not\equiv g(y) \Rightarrow WI(f(x), g(y), T)$

The only cost to consider here is that of storing g(y). If |y| is less than the space
bounds allotted to our problem solver, we would include this as a CI lemma. Notice
that our reformulation cost measures do not include the cost of computing the relation g.
The time and space bounds are on the end product of the reformulation and not on the
computation of the end product. This is because we amortize the cost of reformulation
over all future uses of the new formulation with better computational properties.

3.7.2  Information  needed  to  prove  irrelevance 

The information needed to demonstrate that a certain class of facts or distinctions is
logically irrelevant to a specified class of goals includes

1.  Information  about  the  given  encoding 

We need to know what relations are represented explicitly/implicitly, the number of
facts in the encoding that match a given form, information about the completeness
of the database for a given relation, as well as information about the stability of the
encoding for a given relation. As an example of the latter, consider the situation
in the kinship example where Father facts can be retracted later. In this case, we
cannot destroy the distinction between Father and Ancestor facts, because doing so
will prevent us from integrating new Father facts as the appropriate Ancestor facts.
Definedness graphs, introduced in Chapter 5, maintain dependencies between various
parts of the encoding in a very efficient manner.

[Figure 3.2: Logics of Relevance and Irrelevance. Surviving labels: Weak Irrelevance,
Independence. An arrow x → y means that x is weaker than y.]

2.  Information  about  the  conceptualization 

Properties of the objects and relations that are encoding independent, e.g.,
associativity, commutativity, and symmetry of relations.

3.  Information  about  the  goals 

We take the goals directly into account in our formulation of irrelevance. The
frequency and distribution of queries is an important factor in the determination of
what information can be ignored.

3.8  Related  Areas 

A formal study of irrelevance reveals that there is interesting substructure in the space of
useful irrelevance inferences. There is a hierarchy of irrelevance logics that is similar to
the hierarchy of logics that capture the notion of relevance [AB75,DR87]. See Figure 3.2.
These logics are not exact duals of each other; many aspects of the relationship between
them remain to be investigated. The statement that f is relevant to g in T conveys the
information that knowing f in T restricts the space of possibilities for g. The statement
that f is irrelevant to g in T indicates that the value of g is insensitive to the value of
f, in that even if f were changed in T, g would not. In the missionaries and cannibals
puzzle, the names of the missionaries are irrelevant to the scheduling of the boat trips.
This means that even if the names were changed³, the solution (which does not refer to
the missionaries by name) would not. This irrelevance statement gives us the justification
to modify the formulation in which missionaries are named to the more abstract one that
uses the cardinality of the set of missionaries.

The two notions of relevance and irrelevance are complementary; one expresses a
dependence between two facts and the other captures a one-sided lack of dependence (g being
not dependent on f). However, the normal use of an irrelevance statement is to modify
T while preserving g, whereas the normal use of a relevance statement is to infer restrictions
on g given f; thus the inferences they sanction are not duals.


³As long as they remain distinct from each other!


Chapter  4 

Generating  Reformulations 


4.1  Generation = Discovery + Reduction

This chapter describes how to use irrelevance claims to generate reformulations. The
previous chapter outlined the descriptional import of irrelevance claims; here we discuss their
assertional import, i.e., the inferences that they sanction. We present a framework for the
generation of abstraction reformulations: first, the discovery of appropriate irrelevance
claims in the meta-theory of a formulation, and then the reduction of the formulation
by inferences that minimize irrelevant distinctions. This method of generating
abstractions by minimizing distinctions irrelevant to a given class of goals can be treated as a
first-principles account of abstraction. We discuss the logical character of the reduction
and the discovery processes and analyze their computational complexity. We state the
irrelevance principle, which advises minimization of distinctions modulo a goal, and show
that it underlies the construction of a large class of abstractions.


4.2  The  Irrelevance  Principle 


Irrelevance  claims,  both  logical  and  computational,  identify  irrelevancies  and  redundancies 
in  the  formulation  of  a  problem.  For  instance,  in  the  kinship  example  that  we  have  used 
so  far,  the  first  irrelevance  claim 


IC1:

$\forall x\,y\,m\,n.\ Ancestor(x, y) \in T \Rightarrow WI(Father(x, y), SameFamily(m, n), T)$


expressed the fact that the distinction between immediate and non-immediate ancestry,
represented by the relations Father and Ancestor respectively, is irrelevant to the
SameFamily relation. A reduction inference uses this irrelevance claim to modify the
formulation so that it no longer makes this irrelevant distinction.

One modification that a reduction inference can perform is to relabel all Father facts as
Ancestor facts, effectively erasing the distinction between the relations they denote. This
inference is sanctioned by the irrelevance principle, whose informal statement is: removing
distinctions irrelevant to a goal schema leads to the construction of computationally
tractable theories for that class of goals. Note that this is a local optimization principle
that justifies hill-climbing in the space of conceptualizations toward reducing distinctions.
The global irrelevance principle states that the best reconceptualization makes the fewest
distinctions consistent with the correctness and goodness constraints given. That is, the
optimal formulation is the one that doesn't make any more distinctions than are logically
or computationally necessary. We will call both these principles representational
irrelevance principles since their domain of discourse is conceptualizations. We can conceive of
a similar set of principles that talk about computations performed by an encoding. The
local version of such a principle advises hill-climbing in the space of encodings toward
fewer computations, consistent with the correctness and goodness constraints. The
global version of this principle states that the best re-encoding is one that doesn't do any
more computation than is necessary to achieve the goal within the given space constraints.
We shall call these two principles the local and global computational irrelevance principles
respectively. The representational and computational principles are akin to Snell's law for
computational systems: they propose taking the path of least resistance in the space of
representations and computations.
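The relabeling step can be sketched on a small set of ground facts. The code below is a hypothetical illustration: the fact triples, the individuals named in them, and the assumed definitions (Father implies Ancestor, Ancestor is transitive, SameFamily holds between individuals with a common ancestor) are stand-ins for the kinship encoding, chosen only to show that the goal relation is unchanged by the relabeling.

```python
def relabel(facts, old, new):
    """Rewrite every fact of relation `old` as a fact of relation `new`."""
    return {(new if rel == old else rel, x, y) for rel, x, y in facts}


def ancestor_pairs(facts):
    """Ancestor pairs: Father facts, Ancestor facts, and their transitive closure."""
    closure = {(x, y) for rel, x, y in facts if rel in ("Father", "Ancestor")}
    while True:
        new = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        if new <= closure:
            return closure
        closure |= new


def same_family(facts):
    """SameFamily(m, n) holds when m and n share a common ancestor."""
    anc = ancestor_pairs(facts)
    return {(m, n) for (z, m) in anc for (w, n) in anc if z == w}


# Hypothetical ground facts.
FACTS = {("Father", "adam", "cain"), ("Father", "adam", "abel"), ("Father", "cain", "enoch")}
REDUCED = relabel(FACTS, "Father", "Ancestor")

goal_before = same_family(FACTS)
goal_after = same_family(REDUCED)
relations_after = {rel for rel, _, _ in REDUCED}
```

After relabeling, the reduced fact set no longer mentions Father at all, yet every SameFamily answer is preserved, which is what the irrelevance claim IC1 licenses.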

The representational irrelevance principle is a variation of Quine's principle of
indiscernibility of identicals [Qui63]. It is re-echoed in Harman's [Har86] principle of clutter
avoidance and Lenat's notion of cognitive economy [LHRK79]. Whereas the
representational irrelevance principle outlaws making unnecessary distinctions, the computational
irrelevance principle outlaws unneeded computation. To apply the computational
irrelevance principle we need to specify an encoding and a problem solver to determine the
computations that are performed. The calculus of computational irrelevance is a
declarative way of specifying the distinctions in a formulation that give rise to wasted computation
in a problem solver.

We can state the representational and computational irrelevance principles formally.
To do that we shall introduce two definitions.

Definition 21  A conceptualization $C_1 = (\mathcal{O}_1, \mathcal{F}_1, \mathcal{R}_1)$ makes fewer distinctions than the
conceptualization $C_2 = (\mathcal{O}_2, \mathcal{F}_2, \mathcal{R}_2)$ with respect to the goal relation G, written as
Makes-fewer-distinctions($C_1$, $C_2$, G), if Definable(G, $C_1$) $\land$ Definable(G, $C_2$) $\land$ Definable-C($C_1$, $C_2$).

That is, a conceptualization $C_1$ makes fewer distinctions than $C_2$ in the service of G if
G is definable in both, and if the objects, functions and relations of $C_1$ are constructible
from those of $C_2$.

We wish to find the formulation which makes the fewest distinctions that can compute G
most efficiently. To do that, we need to define a metric on encodings that determines if
one encoding performs less computation than another in the service of the same goal.

Definition 22  Encoding $E_1$ of the conceptualization $C_1$ computes G more efficiently than
encoding $E_2$ of $C_2$ if G is definable in $C_1$ and $C_2$, and if the cost of proving the encoding
$G_1$ of G in $E_1$ is less than that of proving the encoding $G_2$ of G in $E_2$ with respect
to the given cost metric C on the problem solver PS. We then say that
Makes-fewer-computations($E_1$, $E_2$, G, C, PS).

The local representational irrelevance principle (LRIP) can be stated as:

LRIP: Makes-fewer-distinctions($C_i$, $C_j$, G)
      $\Rightarrow$ Better-Conceptualization($C_i$, $C_j$, G)

A conceptualization is better than another if it can express G while making fewer
distinctions. The global representational irrelevance principle states that the best
conceptualization for G is such that

GRIP: For a given goal G, select $C_i$ such that $\neg\exists C_j.$ Better-Conceptualization($C_i$, $C_j$, G).

Clearly, the $C_i$ that satisfies this constraint is one that contains G alone! This is because in
a world where there are no computational constraints, the best formulation for a problem
is one that directly contains the answers: a giant lookup table for G. To take
computational constraints into account, we modify the LRIP so that not only does a better
conceptualization make fewer distinctions, but it also permits an encoding that makes
fewer computations according to some cost metric C on a problem solver PS.

The modified local representational irrelevance principle (LCRIP) states that

LCRIP:

$\forall C_i\,C_j.$ Makes-fewer-distinctions($C_i$, $C_j$, G)
$\land\ \exists PS.\ \exists C.\ \exists E_k.$ Encoding($E_k$, $C_i$)
$\land\ \forall E_l.$ Encoding($E_l$, $C_j$) $\land$ Makes-fewer-computations($E_k$, $E_l$, G, C, PS)
$\Rightarrow$ Better-Conceptualization($C_i$, $C_j$)

The local computational irrelevance principle (LCIP) states that

LCIP: Makes-fewer-computations($E_i$, $E_j$, G, C, PS) $\Rightarrow$ Better-Encoding($E_i$, $E_j$, G, C, PS)

An  encoding  is  better  than  another  if  it  performs  fewer  computations  in  the  service  of  G. 
The  global  computational  irrelevance  principle  states  that  the  best  encoding  for  G  with 
respect  to  a  given  problem  solver  and  a  cost  metric  is  such  that 

GCIP: For a given goal G, a problem solver PS and cost metric C, select $E_i$ such that
$\neg\exists E_j.$ Better-Encoding($E_i$, $E_j$, G, C, PS)

If there are multiple $E_i$'s with this property, we choose one which encodes a
conceptualization with the fewest distinctions. This intuition is expressed in the modified LCIP
below.

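LRCIP:

$\forall E_i\,E_j.\ \exists C.\ \exists PS.$ Makes-fewer-computations($E_i$, $E_j$, G, C, PS)
$\land\ \exists C_k.$ Encoding($E_i$, $C_k$)
$\land\ \forall C_l.$ Encoding($E_j$, $C_l$) $\land$ Makes-fewer-distinctions($C_k$, $C_l$, G)
$\Rightarrow$ Better-Encoding($E_i$, $E_j$, G, C, PS)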

How should the search for a good reconceptualization proceed? The irrelevance
principles provide natural gradients in the space of all conceptualizations. One approach is to
start with the given conceptualization and incrementally remove distinctions to arrive at
one that meets the given correctness and goodness constraints. This is the approach used
in the thesis. A complementary approach is to begin with a conceptualization that has
G alone and incrementally add distinctions to meet the computational constraints. The
problem of determining which distinction to add is significantly harder than the problem
of determining which to remove, because the number of distinctions that could potentially
be added far exceeds those that can be removed. However, when this condition is violated,
the second approach to traversing the space of conceptualizations is the preferable one.
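The first approach, removing distinctions one at a time while the goal stays derivable, can be pictured as a greedy loop. The sketch below is a strong simplification and not the reformulator described in this thesis: it treats each clause of a hypothetical propositional Horn theory as one removable "distinction", uses derivability of a single goal atom as the only correctness constraint, and ignores goodness constraints altogether.

```python
def forward_chain(clauses):
    """Return every atom derivable from definite clauses (body, head)."""
    known = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in known and all(a in known for a in body):
                known.add(head)
                changed = True
    return known


def greedy_reduce(clauses, goal):
    """Hill-climb: repeatedly drop one clause while the goal stays derivable."""
    current = list(clauses)
    improved = True
    while improved:
        improved = False
        for clause in list(current):
            candidate = [c for c in current if c != clause]
            if goal in forward_chain(candidate):
                current = candidate
                improved = True
                break
    return current


# Hypothetical theory: f, f => q, q, q => g.
T = [((), "f"), (("f",), "q"), ((), "q"), (("q",), "g")]
reduced = greedy_reduce(T, "g")
```

The result depends on the order in which clauses are tried, which mirrors the local (hill-climbing) character of the principle: the loop stops at a formulation from which nothing more can be removed, not necessarily at a globally best one.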

The irrelevance principles as stated are normative. We can determine whether a
particular conceptualization shift or an encoding shift respects the irrelevance principle. Thus,
the conceptual shift in the kinship example that required introducing the new relation
FoundingFather and eliminating Father for solving the goal relation SameFamily
obeys the LCRIP. This is because the new conceptualization is definable in terms of the
old one, and thus makes fewer distinctions. Also, it can be encoded to meet the given
time and space constraints. The compression of the transitivity chains in the encoding of
the same example by introducing the new predicate symbol FoundingFather is an encoding
shift that is better for any non-caching, deductive problem solver under any reasonable
cost metric (proof heights, size of the search space). This allows us to show that the
encoding shift in the kinship example follows the LCIP. Demonstrating that a particular
conceptualization or encoding satisfies GRIP or GCIP is extremely difficult because of
the enumeration over all possible conceptualizations and encodings for G. In this thesis
we focus entirely on satisfying the local versions of the two principles.

The representational and computational irrelevance principles are related; for some
problem solvers and for some classes of encodings, minimizing distinctions while
preserving the goal actually leads to minimizing computation. In fact, CI claims, introduced in
Chapter 3, capture exactly those cases where removing information (minimizing
distinctions) leads to removal of unneeded or redundant computation.

4.3  Reduction  Inferences 

What is the role of the irrelevance principle in generating reformulations? We propose
reduction inferences that operate on encodings of conceptualizations and that use
irrelevance claims. The task of the reduction inference is to implement the minimization of
distinctions suggested by the LRIP or the minimization of computation as suggested by
the LCIP. Since the irrelevance claims themselves are in the realm of encodings (e.g., IC1),
a reduction inference can operationally interpret a claim of the form $c \Rightarrow WI(f, g, T)$ as
follows: enriching a formulation to make the condition on the left-hand side of the claim true
allows the removal of the facts in it that are specified as irrelevant. This is the situation
depicted in Figure 2.12. The move at the meta-theoretic level is toward a formulation in
which none of the irrelevance claims can be derived; i.e., a formulation that doesn't make
any of the distinctions deemed irrelevant to the class of goals.

Theory + Meta-theoretic Irrelevance Claim IC
---------------------------------------------------------------- Reduction Inference
New-Theory where IC is not true of the meta-theory of New-Theory

Figure 4.1: The form of a reduction inference

This  section  describes  the  mechanics  of  modifying  a  formulation  using  irrelevance 
claims.  Figure  4.1  describes  the  form  of  a  reduction  inference.  We  introduce  the  relation 
Reduce  that  takes  a  formulation  and  an  irrelevance  claim  and  produces  a  new  formulation 
in  whose  meta-theory  the  irrelevance  claim  does  not  hold. 

Definition 23 An irrelevance claim IC of the form c => WI(f, g, T) holds in the
meta-theory of T if we can apply the T' construction of Definition 15 on T augmented to make
c hold.

If the T' constructed is equal to T,¹ we say that IC does not hold in the meta-theory
of T. The construction in Definition 15 does not specify T' uniquely, so we constrain
the choice by requiring it to be the largest subset of T that satisfies the definition:
Reduce(T, WI(f, g, T)) = T', where T' is the largest subset of T that satisfies
Definition 15. Unfortunately, this still does not fix T' uniquely. Consider the following example.

Example 8 Let T = {a, b, a => g, b => g}. We can prove that WI(a ∧ b, g, T). If we
reduce T by this claim, we have two candidate T''s: T'₁ = {a, a => g, b => g} and
T'₂ = {b, b => g, a => g}.
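
The non-uniqueness in Example 8 can be checked mechanically. The sketch below is only an illustration: it assumes a propositional encoding in which facts are strings and rules are (premise, conclusion) pairs, and it reads a candidate T' as a maximal subset of T that still derives g but no longer entails the conjunction a ∧ b. It is not the construction of Definition 15, and the names closure and candidate_reductions are hypothetical.

```python
from itertools import combinations

# Hypothetical encoding: a fact is a string, a rule is a (premise, conclusion) pair.
T = frozenset({"a", "b", ("a", "g"), ("b", "g")})

def closure(theory):
    """Forward-chain the single-premise rules over the atomic facts."""
    facts = {s for s in theory if isinstance(s, str)}
    rules = [s for s in theory if isinstance(s, tuple)]
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def candidate_reductions(theory, irrelevant, goal):
    """Maximal subsets that derive `goal` without entailing every atom in `irrelevant`."""
    found = []
    items = list(theory)
    # Visit larger subsets first so that maximality is a simple containment check.
    for size in range(len(items), -1, -1):
        for subset in combinations(items, size):
            s = frozenset(subset)
            derived = closure(s)
            if goal in derived and not irrelevant <= derived:
                if not any(s < bigger for bigger in found):
                    found.append(s)
    return found

candidates = candidate_reductions(T, {"a", "b"}, "g")
for c in candidates:
    print(sorted(map(str, c)))
```

Under these assumptions the search returns exactly the two theories named in the example, so a reducer has to choose between them.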

The class of T's and WI claims for which a unique T' can be found is characterized by the
following theorem.²

¹ It violates the strict subset requirement.

² When the T''s are not unique, the reducer makes a choice which may later be retracted. This
issue is discussed in the implementation of a reduction system. The search space for reduction
is a function of the cardinality of the possible T''s at each point.

Theorem 10 Reduce(T, IC) produces a unique T' for sets T that have no disjunctions
and for claims IC = WI(f, g, T) that have non-conjunctive f's.

Proof: If T has disjunctions, it might allow multiple derivations of g that don't involve f,
and this would cause multiple T''s to be generated. Since WI(f₁ ∧ f₂, g, T) => WI(f₁, g, T)
∨ WI(f₂, g, T), conjunctive f's automatically create multiple maximal subsets of T that
derive g without entailing f. □

We  would  like  to  let  reduction  inferences  change  the  formulation  minimally  to  remove 
irrelevant  distinctions.  This  policy  specifies  a  preference  among  the  formulations  that 
result  by  revising  a  given  one  to  eliminate  a  distinction:  it  conservatively  picks  a  theory 
that  is  “closest”  to  the  original  theory.  The  reason  that  this  is  a  computationally  sensible 
policy  to  adopt  is  that  it  minimizes  the  work  done  by  the  reduction  process  itself. 

In order to reduce a formulation with respect to an irrelevance claim, we apply the
construction of Definition 15 until the irrelevance claim no longer holds in the
meta-theory of the new formulation. When this happens, the T' returned by the construction
is identical to T.

Theorem 11 The reduction process with respect to a claim IC terminates when
Reduce(Tₙ, IC) = Tₙ.

Proof: When this condition obtains, the irrelevance claim IC is no longer true in the
meta-theory and reduction with respect to it has to terminate. □

The  iterative  application  of  Reduce  implements  the  meta-theoretic  movement  to  the 
empty  set  of  irrelevance  claims  depicted  in  Figure  2.12.  The  following  theorems  also  hold. 

Theorem 12 Applying Reduce iteratively with respect to a given WI claim until a fixed
point of the theory is reached produces a new conceptualization that is better as defined
by the LRIP.

Proof: Every reduction by a WI claim removes some fact or a class of facts from a theory.
The theory gets weaker since its deductive closure decreases, and the number of models it
has increases. Each model of the new theory can be created by dropping elements from a
model of the old theory. We can treat conceptualizations as special models which include
the Herbrand base and the named functions and relations on that base. Thus the new
conceptualization can be defined in terms of the old one. Hence the new conceptualization
makes fewer distinctions and is a better conceptualization in the LRIP sense. □

Theorem 13 Applying Reduce iteratively with respect to a given CI claim until a fixed point
is reached produces a new encoding that is better as defined by the LCIP.

Proof: By construction, reduction by a CI claim guarantees that the removal of a fact
or a class of facts from the given encoding causes the goal to be solved faster in the sense
defined by the cost-metric C for a given problem solver PS. So the new encoding performs
fewer computations in the service of the goal schema, and by the LCIP it is a better
encoding. □

Depending on whether the reduction process operates directly on the present formulation
or on a description of it, we have two modes of reduction: extensional and intensional.

4.3.1  Extensional  Reduction 

Extensional reduction performs direct surgery on the formulation as dictated by the
irrelevance claims and generates a new formulation that obeys both the goodness and the
correctness constraints. We give a non-procedural account of the ExtReduce action that
achieves the reduction. ExtReduce is a special case of Reduce that has the following
input-output behaviour.

Inputs:  • A set of sentences T

         • A set I of meta-theoretical claims of irrelevance in T
           with respect to goal schema g

Output:  • A new set of sentences T' in whose meta-theory
           no element of I can be derived.

This form of reduction is called extensional reduction because it works with the
sentences in T and with ground instances of the irrelevance claims. For instance, if we
instantiate IC1 with the bindings {x=A, y=B}, call it IC1(A,B), and then compute the
value of ExtReduce(T, IC1(A,B)), we find it to be T ∪ {Ancestor(A,B)} −
{Father(A,B)}. This constitutes one step of the reduction process that relabels Father
links as Ancestor links throughout the tree. We can use IC1 again to reduce T'. The
process terminates when all Father facts are relabelled as Ancestor facts. Then IC1 no longer
holds in the meta-theory and the construction in Definition 15 makes no more changes to
the input theory.


Theorem 14 The extensional reduction process terminates when ExtReduce(Tₙ, IC) = Tₙ.

Theorem 15 If the irrelevance claim IC expressed in prenex normal form has n
quantifiers, then the number of extensional reduction steps that need to be performed is bounded
by the product of the domain sizes of the quantifiers.

Proof: By construction. ExtReduce has to obtain all ground instances of the claim
before it can perform the reduction, so in the worst case we get the above upper bound. □
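
The step-by-step character of extensional reduction, and the termination test of Theorem 14, can be sketched as follows. This is a minimal illustration, assuming ground facts are stored as (predicate, x, y) tuples and that one ExtReduce step with a ground instance of IC1 relabels a single Father fact; the function names and the sample facts are hypothetical.

```python
def ext_reduce_step(theory):
    """One step: T ∪ {Ancestor(x, y)} − {Father(x, y)} for some ground Father fact."""
    for fact in sorted(theory):
        pred, x, y = fact
        if pred == "Father":
            return (theory - {fact}) | {("Ancestor", x, y)}
    return theory  # no Father fact left: the claim no longer applies

def ext_reduce(theory):
    """Iterate until ExtReduce(T_n, IC) = T_n, counting the steps taken."""
    steps = 0
    while True:
        reduced = ext_reduce_step(theory)
        if reduced == theory:
            return theory, steps
        theory, steps = reduced, steps + 1

T = frozenset({("Father", "a", "b"), ("Father", "a", "c"), ("Father", "b", "d")})
T_final, steps = ext_reduce(T)
print(steps, sorted(T_final))
```

With a four-element domain and the two quantified variables of IC1, the bound of Theorem 15 is 4 × 4 = 16 steps; this run needs only one step per Father fact.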

4.3.2  Intensional  Reduction 

The  input-output  specification  of  intensional  reduction  IntReduce  is  as  follows: 

Inputs:  • A description of a set of sentences T

         • A set I of meta-theoretical claims of irrelevance in T
           with respect to the goal schema g

Output:  • A new description of a set of sentences T' in whose meta-theory
           no element of I can be derived.

One  way  to  conceptualize  intensional  reduction  is  to  think  of  it  as  limit  reasoning  [Wel86] 
over  extensional  reduction.  Intensional  reduction  takes  a  description  of  the  formulation  as 
well  as  the  irrelevance  claims  and  produces  a  description  of  the  reduced  formulation  that 
results  after  the  irrelevance  minimization  process  terminates.  We  present  an  example  to 
clarify  this  idea. 

Suppose that all the Father facts in Figure 1.3 have already been relabelled as Ancestor
facts. Now we want to minimize the formulation using IC2. While the extensional
reduction process laboriously rewires the family tree by applying ExtReduce over and over,
intensional reduction attempts to find a description of the facts left at the end of the
extensional reduction. A description of the Ancestor facts at the start is:

∀u. T ⊨ Ancestor(root, u).

We wish to find the description of T' such that Reduce(T', IC2) = T'. Ancestor(root, u)
is shorthand for the formula Ancestor(x, u) ∧ ¬∃m. Ancestor(m, x).

The reasoning proceeds in two steps.

1. IC2 preserves connectedness to the root.

∀u. T ⊨ Ancestor(root, u) => Reduce(T, IC2) ⊨ Ancestor(root, u)

2. When the reduction terminates, IC2 can no longer be applied.

By examining the preconditions of IC2, we see that the required condition is that
all Ancestor links be one step long; i.e., ¬∃x y z. Ancestor(x, y) ∧ Ancestor(y, z).

Putting 1 and 2 together, we obtain the fact that all Ancestor links are connected to the
root and that they are all one step long. That is, the remaining Ancestor facts satisfy
Ancestor(root, y), which is an abbreviation for Ancestor(x, y) ∧ ¬∃m. Ancestor(m, x), which
is the required definition of the Founding-Father relation!

Mathematical  induction  was  required  here  because  of  the  recursion  on  Ancestor.  Note 
that  the  description  of  the  initial  formulation  was  used  directly  as  the  invariant  in  the 
induction  proof.  The  termination  condition  was  obtained  by  negating  the  left  hand  side 
of  IC2. 
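
The end state predicted by the intensional argument can be compared with what the extensional process actually produces on a small tree. The sketch below assumes that a ground step of IC2 replaces a link Ancestor(y, z) by Ancestor(x, z) whenever Ancestor(x, y) is present, which is our reading of the rewiring described above rather than a statement of IC2 itself. The tree and the name compress_links are hypothetical.

```python
def compress_links(ancestor):
    """Rewire links until no two Ancestor links can be chained (the termination condition)."""
    links = set(ancestor)
    steps = 0
    while True:
        # Look for a chain Ancestor(x, y), Ancestor(y, z).
        chain = next(((x, y, z)
                      for (x, y) in sorted(links)
                      for (y2, z) in sorted(links) if y2 == y), None)
        if chain is None:
            return links, steps
        x, y, z = chain
        links.remove((y, z))   # Ancestor(y, z) is deemed irrelevant ...
        links.add((x, z))      # ... once the higher link Ancestor(x, z) is present
        steps += 1

tree = {("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "f"), ("c", "g"), ("d", "h")}
founding, steps = compress_links(tree)
print(steps, sorted(founding))
```

On this acyclic input every remaining link starts at the root a and no two links chain, which is the conjunction of steps 1 and 2 of the argument.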


4.3.3  Properties  of  Reduction 
The  form  of  the  base  level  theory 

How  sensitive  are  the  irrelevance  claims  to  the  actual  encoding?  Suppose  we  had  instead 
the  following  encoding  of  the  kinship  problem. 

Father(x,y)  =>  Ancestor(x,  y) 

Father(x,  y)  A  Ancestor(y,  z)  =>  Ancestor(x,  z) 

Ancestor(z,  x)  A  Ancestor(z,  y)  =>  SameFamily(x,  y) 

The irrelevance claims are the same; the reduction procedure has to be sensitive to
the particular encoding. We now show how Reduce transforms the formulation using IC1.
It is clear that all ground Father facts will get replaced by ground Ancestor facts. The
inference sanctioned by IC1 is the replacement of the schema Father(x, y) by the schema
Ancestor(x, y) in the theory. We do the replacement throughout the formulation and
obtain the rule Ancestor(x, y) ∧ Ancestor(y, z) => Ancestor(x, z); the first rule becomes
a tautology.

Domain  of  discourse  of  the  irrelevance  claim 

The  complexity  of  the  reduction  inferences  is  a  function  of  the  domain  of  discourse  of 
the  irrelevance  claims.  For  the  kinship  example,  the  irrelevance  claims  were  on  particular 
classes  of  facts,  already  in  the  formulation.  The  complexity  arose  from  attempting  to 
find  an  intensional  characterization  of  the  reduction.  For  the  Fibonacci  transformation, 
the  irrelevance  claims  refer  to  repeated  computations,  and  the  reduction  process  has  to 
determine the changes to the formulation that ensure that those computations are not
redone.

Order  of  Reduction 

When there is more than one irrelevance claim, there is a non-deterministic choice of
the order of reduction. One natural question to ask is whether the order of reduction of
irrelevance claims makes a difference to the end result; i.e., whether reductions satisfy the
Church-Rosser property. The answer is that, even in the cases where it is meaningful to reduce
in any order, reductions do not necessarily obey the Church-Rosser property. When the
claims are independent, the order is irrelevant. In our example, if we start with a
database containing Father facts alone, the analysis of the preconditions of IC1 and IC2
shows that IC1 needs to be applied before IC2, so there is only one order of reduction.
Suppose, on the other hand, that the effect of an irrelevance claim IC3 is to take the
transitive reduction of the database and the effect of a claim IC4 is to take the transitive
closure of the database. Then it is clear that the order of reduction makes a significant
difference to the end result.
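
The closing example can be made concrete. The following sketch, with hypothetical function names and a three-edge database, treats the two claims simply as graph operations on an acyclic edge set and applies them in both orders.

```python
def transitive_closure(edges):
    """Add every edge implied by a path."""
    closed = set(edges)
    while True:
        new = {(a, d) for (a, b) in closed for (c, d) in closed if b == c} - closed
        if not new:
            return closed
        closed |= new

def transitive_reduction(edges):
    """For an acyclic edge set, drop each edge that is implied by a longer path."""
    closed = transitive_closure(edges)
    nodes = {n for edge in edges for n in edge}
    return {(a, c) for (a, c) in edges
            if not any((a, b) in closed and (b, c) in closed
                       for b in nodes - {a, c})}

db = {("a", "b"), ("b", "c"), ("a", "c")}
reduce_then_close = transitive_closure(transitive_reduction(db))
close_then_reduce = transitive_reduction(transitive_closure(db))
print(sorted(reduce_then_close))
print(sorted(close_then_reduce))
```

The two orders end in different databases, so this pair of reductions is not confluent.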

Termination  of  Reduction 

How can we know that the reduction process terminates? We can reason that the
reduction process terminates on the kinship example by using the fact that the databases are
finite and that the relations are acyclic. Suppose we attempt to reduce a graph denoting
paths between adjacent rooms as in Figure 4.3 with IC2. The reduction process will
not terminate: every possible cycle on three nodes will be generated. Standard methods
of termination reasoning [Man74] can be used to determine non-termination in this case.
However, the fact that termination reduces to the halting problem poses fundamental
limits on inferring termination.

4.3.4  Another  Formulation  of  Irrelevance  Minimization 

Here we formulate the problem of minimizing irrelevant distinctions in a formulation in
terms of minimizing computational entities like proofs and search spaces. The predicate
Proof(p, g, T) represents the fact that p is a proof of the goal g within the encoding T.
If g is a goal that is defined by a wff with a prefix existential quantifier ∃,³ then we can
define a relation Witness that extracts the value of the existentially quantified variable
in a proof of g.

³ (e.g., SameFamily(x, y) is defined to be true if there exists a z such that z is the
ancestor of both x and y)

For the kinship example, we have the following property of the proofs of SameFamily.

∀x y. Proof(p, SameFamily(x, y), E₁) ∧ Witness(p, z) ∧ E₁ ⊨ Ancestor(z, z') =>
∃p' z'. Proof(p', SameFamily(x, y), E₁) ∧ Witness(p', z')

We can implement the LCIP by minimizing the number of proofs that a goal schema
has in a given encoding. The proliferation of proofs in E₁ is caused by the transitivity
of the Ancestor relation, and the statement above expresses it. If we allow a proof of
SameFamily(x, y) that uses a witness z, we will have to accept all other proofs of the
same fact that cite the ancestors of z as witnesses. To minimize the number of proofs for
each SameFamily fact, we simply pick the “highest” proof and modify the formulation to
eliminate the smaller proofs that cite lower ancestors as witnesses. This transformation in
proof space translates to introducing the FoundingFather relation in the formulation space.
In general, this minimization process introduces new terms that stand for macro-objects
in the formulation space and macro-actions in the search space.
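
The effect on the number of witnesses can be counted directly. The sketch below is illustrative only: it uses the six Father facts of the small family tree, computes Ancestor as the transitive closure of Father, and takes FoundingFather to be the Ancestor links whose first argument has no ancestor. The helper names are hypothetical.

```python
FATHER = {("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "f"), ("c", "g")}

def ancestors(father):
    """Ancestor as the transitive closure of Father."""
    anc = set(father)
    while True:
        new = {(x, z) for (x, y) in anc for (y2, z) in father if y == y2} - anc
        if not new:
            return anc
        anc |= new

def witnesses(links, x, y):
    """All z that can be cited as the common ancestor in a proof of SameFamily(x, y)."""
    return {z for (z, child) in links if child == x and (z, y) in links}

ANCESTOR = ancestors(FATHER)
# Keep only the links that start at a person with no recorded ancestor.
FOUNDING = {(z, c) for (z, c) in ANCESTOR
            if not any(child == z for (_, child) in ANCESTOR)}

print(sorted(witnesses(ANCESTOR, "d", "e")))   # proofs available in the original encoding
print(sorted(witnesses(FOUNDING, "d", "e")))   # proofs left after the reformulation
```

In this example SameFamily(d, e) has two witnesses under Ancestor and exactly one under the founding-father links, while the set of pairs related by SameFamily is unchanged.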

Whereas traditional circumscription minimizes the extents of predicates and objects
in a theory to construct minimal models of the theory, we minimize the objects and the
extents of predicates in order to minimize the computational entities, such as space and
intermediate computation, that arise in solving a goal schema within that theory.


4.4  The  discovery  of  irrelevance  claims 

4.4.1  By  being  told 

This is the easiest way of acquiring irrelevance claims. As we have shown earlier,
incorporating irrelevance claims into a theory is a non-trivial endeavour. In general, the
stronger the impact of the reduction, the harder it is to discover the corresponding claim
automatically. The irrelevance claim, however, can be verified by the irrelevance reasoner.

4.4.2  By  derivation  in  the  meta-theory 

We use the lemmas of WI and CI to suggest possible irrelevance claims. These lemmas
have the form: Condition on formulation => WI(fact-schema, goal-schema, formulation).
We “backward chain” on these lemmas by setting up the goal of proving a certain
fact-schema irrelevant to the given goal-schema. We illustrate this by an example.
Take WI lemma 9,

1. q(l) ∈ T ∧ Derives-Only(f(l), q(l), T) => WI(f(l), g(s), T) where f(l) £ g(s)

and resolve it against the fact that

2. Derives-Only(Father(x, y), Ancestor(x, y)), which is true of the formulation.

This resolution generates the following binding list: {f = Father, l = (x, y), s = (m, n)}.
Given that the goal is SameFamily(a, b), we now verify that Father(x, y) SameFamily(a, b),
which is easy given the definedness graphs that are introduced in Chapter 5, and obtain

3. Ancestor(x, y) ∈ T => WI(Father(x, y), SameFamily(m, n), T), which is the required
claim.

Figure 4.2: Replacing a subtree by a node in the proof of SameFamily

Goal-directed derivation of irrelevance claims in the meta-theory is only as good as the
lemmas in the calculus of WI and CI. The ultimate source of irrelevance claims is in
empirical analyses of computational structures as well as statistical correlations between
attributes in the world.

4.4.3  By  empirical  analyses  of  proof  and  search  spaces 

We  can  use  the  declarative  specification  of  the  problem  solver  to  construct  the  symbolic 
computation  trace  for  the  goal  g.  In  our  example,  all  SameFamily  proofs  end  in  Father  facts. 
The  fringe  of  all  these  proof  trees  can  be  moved  up  one  step  closer  to  the  root  without 
increasing  the  width  of  the  tree,  if  the  proofs  are  made  to  terminate  in  the  corresponding 
Ancestor  facts.  This  transformation  of  the  proof  tree  can  be  accomplished  by  relabelling 
the  Father  tree  as  the  Ancestor  tree  and  by  getting  rid  of  the  Father  rule.  The  proof  tree 
transformation  is  a  schema  for  reconfiguring  proofs  and  the  irrelevance  claim  expresses 
what  the  corresponding  formulation  transformation  should  be. 

The second irrelevance claim results from another general proof tree transformation that
attempts to reduce the height of a proof. However, this requires that we examine a
portion of the proof tree shown in Figure 4.2. We essentially try to replace a section of
the tree by a single node by the elimination of intermediate variables. In this example,
there are two ways of achieving it: getting rid of the Ancestor(x_i, x_{i+1}) branch, or the
Ancestor(x_{i+1}, x_{i+2}) branch. The correctness constraint filters out the incorrect branch to
prune, and we thus derive IC2. Proof transformation-schema 2 is a very general schema
for reconfiguring proofs because it also underlies the construction of Thevenin equivalents
of an analog circuit.

For the particular problem solver we are considering, short proofs are preferable to long
ones. Therefore, transformations that influence the height of the proof like the ones above
are very useful for reformulations that improve computational efficiency with respect to
this problem solver. In order to cover more general classes of proof transformations and
search space transformations (e.g., eliminating dead ends), a language for expressing and
manipulating proof transformations is needed.


4.5  Reformulations  Justified  by  the  Irrelevance  Principle 

The kinship example of this thesis is a reformulation of an equivalence relation as a
partition, a standard transformation taught to every computer scientist. The method of
irrelevance minimization explains how this transformation can be derived from more
general considerations.

This section demonstrates that a large number of abstractions and optimisation
methods can be analyzed using the theory of irrelevance. For each example, we write the
meta-theoretic irrelevance claims that justify the abstraction and demonstrate that the
reduction methods used minimize irrelevant distinctions. The examples we consider are:
variants of the kinship example, Thevenin equivalents and tail-recursion optimizations
(e.g., Fibonacci).

4.5.1  Variants  of  the  Kinship  Example 

Notice that if the goal relation in the kinship example had been the Common Ancestor
relation, which is true of a triple (x, y, z) just when z is the common ancestor of x and
y, then no information in the initial formulation can be proven to be irrelevant to this
goal. So the compression of ancestor links cannot be done. This demonstrates that the
abstraction method is sensitive to the goals.

Figure 4.3: A case in which extensional reduction does not terminate

Now we will show how some more variations of the kinship example can be analyzed
using irrelevance. Suppose we have the following formulation of connectivity between
rooms. The relation DConn is a list of pairs of rooms that are directly connected. Conn
is the transitive closure of DConn. The goal relation is SameCluster: two rooms belong
to the same cluster if they are connected to a common room. This problem is an ablation
of the Founding Fathers example: the Father relation is not symmetric, whereas DConn
is. An encoding for the SameCluster problem is given below.

DConn(x, y) => Conn(x, y)

Conn(x, y) ∧ Conn(y, z) => Conn(x, z)

Conn(z, x) ∧ Conn(z, y) => SameCluster(x, y)

Conn(x, y) => Conn(y, x)


The irrelevance claims that are true in the meta-theory of this encoding are:

IC3: ∀x y m n. Conn(x, y) ∈ T => WI(DConn(x, y), SameCluster(m, n), T)

IC4: ∀x y z m n. Conn(x, y) ∈ T ∧ Conn(y, z) ∈ T ∧ Conn(x, z) ∈ T =>
     WI(Conn(y, z), SameCluster(m, n), T)

IC5: ∀x y z m n. Conn(x, y) ∈ T ∧ Conn(y, z) ∈ T ∧ Conn(x, z) ∈ T =>
     WI(Conn(x, y), SameCluster(m, n), T)

If we applied extensional reduction to the encoding with the claims IC3 through IC5,
the reduction would not terminate. The condition on the left-hand side of IC4 and IC5
will never be false in the original encoding or its reductions. The lack of directionality in
the DConn facts makes the character of this problem very different from the SameFamily
problem. This is evident in the following statement in the space of proofs. E is the above
encoding.

∀x y. Proof(p, SameCluster(x, y), E) ∧ Witness(p, z) ∧
E ⊨ SameCluster(x, z) => ∃p' z'. Proof(p', SameCluster(x, y), E) ∧ Witness(p', z')

Father(a,b)
Father(a,c)
Father(b,d)
Father(b,e)
Father(c,f)
Father(c,g)
SameFamily(x,x)
Father(x,y) => SameFamily(x,y)
Father(y,x) => SameFamily(x,y)
SameFamily(z1,z2) ∧ Father(z1,x) ∧ Father(z2,y) => SameFamily(x,y)

Figure 4.4: The encoding Es of C₁

A graph-theoretic interpretation of the reduction is informative. Minimizing the
heights of all proofs of SameCluster(x, y) requires that we minimize the sum of the lengths
of the paths from node x to node y for every x and y in the graph. This value is minimized when
all nodes point to a common node. We can do this either by picking a particular node
in the graph, or by introducing a new object to stand for a cluster representative. In
the SameFamily problem, the fact that Father was directed provided us with a canonical
representative of the equivalence relation SameFamily.
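
The idea of making every node point to a cluster representative is what a union-find (disjoint-set) structure does. The sketch below uses that standard technique as an illustration of the target representation; it is not the reduction procedure of this chapter, and the room names and function names are hypothetical.

```python
def make_clusters(dconn):
    """Build a find() function that maps each room to its cluster representative."""
    rep = {}

    def find(x):
        rep.setdefault(x, x)
        while rep[x] != x:
            rep[x] = rep[rep[x]]   # path halving: shorten the chain to the representative
            x = rep[x]
        return x

    for x, y in dconn:
        rep[find(x)] = find(y)     # merge the two clusters; direction is arbitrary
    return find

find = make_clusters([("r1", "r2"), ("r2", "r3"), ("r4", "r5")])

def same_cluster(x, y):
    """SameCluster(x, y) holds when both rooms share a representative."""
    return find(x) == find(y)

print(same_cluster("r1", "r3"), same_cluster("r1", "r4"))
```

Because DConn is symmetric, the choice of representative is arbitrary here, whereas in the kinship problem the direction of Father singles out the founding father.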

What  if  the  kinship  problem  had  been  formulated  entirely  in  terms  of  the  Father 
relation?  An  encoding  that  achieves  this  is  shown  in  Figure  4.4. 

The irrelevance claims IC1 and IC2 do not hold directly in the formulation; they
hold counterfactually. This is because this formulation has no explicit redundancy. The
computation of the SameFamily relation is done by finding the least common ancestor
of two people. In order to reformulate this encoding to meet the computational constraint
of solving SameFamily queries in constant time with O(n) overhead in space (n = the
number of people), we have to introduce the redundant relation Ancestor. If we interpreted
IC1 counterfactually, we would relabel the Father relation as the Ancestor relation. If
IC2 were treated the same way, we would add the Ancestor links on the left-hand side
of the claim in order to render Ancestor(y, z) irrelevant. This would have the effect,
at least extensionally, of generating the maximal ancestor links. The intensional reduction
inferences are non-trivial because the system will have to infer the transitivity of Ancestor
that is hidden in the claim IC2.

Figure 4.5: Compiling out factored knowledge bases

4.5.2  Irrelevance  and  the  factoring  of  knowledge  bases 
Definition 24 A theory T is factorable if and only if ∃T₁ T₂ ... Tₙ such that

• ∪ᵢ Tᵢ = T

• T − Tᵢ ⊭ Tᵢ

Figure 4.5 shows a set of sentences T about the colors and shapes of some blocks. Since
the relations Shape and Color are independent in the world, we can factor T into T₁ and
T₂ as shown in the figure. We can derive this factoring as a reduction via an irrelevance
claim that expresses the independence of Shape and Color.

∀x, y, n. WI(Shape(x, y), Color(x, n), T) ∧ WI(Color(x, n), Shape(x, y), T)

This irrelevance claim is true in the meta-theory of T. To reduce T to make the
irrelevance claim false, we factor T into T₁, in whose meta-theory the first conjunct of
the above irrelevance claim is false, and T₂, in whose meta-theory the second conjunct of
the claim is false. This reduction inference did not introduce any new terms. It is a pure
efficiency transformation: the new theories are indexed in a manner that allows a forward
chaining problem solver to answer questions about the color of blocks (resp. about shapes)
without deriving irrelevant intermediate facts about their shapes (resp. about colors).
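
A small sketch may help fix the idea of factoring as re-indexing. The block facts below are hypothetical (the contents of Figure 4.5 are not reproduced here), the query procedure is a deliberately naive linear scan, and which factor is called T₁ or T₂ is an arbitrary choice in this sketch.

```python
T = {("Shape", "b1", "cube"), ("Color", "b1", "red"),
     ("Shape", "b2", "pyramid"), ("Color", "b2", "blue")}

def factor(theory):
    """Split a set of ground facts into one sub-theory per predicate."""
    factors = {}
    for fact in theory:
        factors.setdefault(fact[0], set()).add(fact)
    return factors

def query(theory, pred, block):
    """Scan every fact; return the answer and the number of facts examined."""
    scanned, answer = 0, None
    for p, b, value in sorted(theory):
        scanned += 1
        if p == pred and b == block:
            answer = value
    return answer, scanned

factors = factor(T)
T1, T2 = factors["Shape"], factors["Color"]
print(query(T, "Color", "b1"))    # scans shape facts as well
print(query(T2, "Color", "b1"))   # scans colour facts only
```

The factors jointly contain exactly the facts of T, so no information is lost; only the amount of irrelevant material examined per query changes.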

It  is  useful  to  look  at  this  transformation  from  an  irrelevance  perspective.  Reorganizing 
a  set  of  facts  in  a  theory  in  a  non-random  way  to  make  the  solution  of  a  given  set  of  queries 
more  efficient  takes  work.  There  is  work  involved  in 

1.  Acquiring  the  appropriate  claims  of  irrelevance.  In  this  case,  these  claims  about  the 
independence  of  relations. 

2.  Reducing  the  formulation  to  eliminate  irrelevance.  In  this  case,  the  independence 
claims  could  be  incorporated  into  the  theory  by  factoring. 

The factored formulation and the original formulation have the same information
content at the base level, which is reflected by the fact that they are equivalent at the
model-theoretic level. However, at the meta-level, different irrelevance claims hold of
them, and this is reflected in the differences in their structural properties. The following
theorem demonstrates the fact that the re-indexing of a theory accomplished by factoring
has computational utility.

Theorem 18 Factoring of theories can be justified by independence claims of the form
WI(p, q, T) ∧ WI(q, p, T). Factoring T into T₁ and T₂ that satisfy Definition 24 achieves
the reduction of irrelevance.

To make the claims in this section more rigorous, we need to set up a more detailed
model of the problem solver and provide the distribution of queries with respect to which
the irrelevance or independence claims are determined. Then we can demonstrate that
the re-encoded theory performs fewer computations (or database lookups) to solve the
given set of queries. We can then propose quantitative measures of ease of re-organization
or re-indexing of a theory in terms of the complexity of discovering the necessary
irrelevance claims and reducing the theory with respect to them.

4.5.3  Applications  of  LCIP 

The computational irrelevance principle states that an encoding that does fewer
computations to achieve a certain goal is better. This is a minimization principle that governs
the way we design representations and optimise computations on those representations.

If we can parameterize a computation, we can find the values at which work is minimized
given some cost model of our problem solver. There are three basic ways in which we can
eliminate computation in the service of a goal.

1.  Eliminate  computations  done  by  an  encoding  that  do  not  contribute  to  the  goal.  An 
example  is  the  following 
Parent(x,y)  =>  Ancestor(x,  y) 

Parent(x,  z )  A  Ancestor(z,  y)  =>  Ancestor(x,  y) 

Suppose  the  query  is:  Ancestor(John,y).  Suppose  further  that  we  have  a  forward 
chaining  problem  solver  that  computes  all  ancestor  facts  from  the  given  parent  facts 
and  then  projects  out  the  ancestors  of  John.  Rewriting  the  encoding  in  the  manner 
given  below  ensures  that  only  the  relevant  ancestors  are  computed.  It  involves 
introducing  a  new  predicate  magic-ancestor  of  arity  one.  The  method  of  rewriting  is 
the  magic  set  method  [BR87]. 

magic-ancestor(John)

magic-ancestor(x) ∧ Parent(x, z) => magic-ancestor(z)

magic-ancestor(x) ∧ Parent(x, z) ∧ magic-ancestor(z) ∧ Ancestor(z, y) => Ancestor(x, y)

It can be shown that the magic set method produces encodings that perform fewer
unneeded computations to solve a goal schema. Reformulations generated by this
method thus implement the intent of the LCIP.
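The effect of the magic predicate can be illustrated with a minimal Python sketch. It is not the magic set algorithm of [BR87] itself: it assumes Parent facts are stored as pairs, it guards both Ancestor rules with the magic set (the names `forward_chain`, `magic_set` and `magic_forward_chain`, and the sample facts, are hypothetical), and it only compares how many Ancestor facts each strategy derives.

```python
def forward_chain(parent):
    """Compute every Ancestor fact derivable from the given Parent facts."""
    ancestor = set(parent)                      # Parent(x, y) => Ancestor(x, y)
    changed = True
    while changed:
        changed = False
        for (x, z) in parent:                   # Parent(x, z) ∧ Ancestor(z, y) => Ancestor(x, y)
            for (z2, y) in list(ancestor):
                if z2 == z and (x, y) not in ancestor:
                    ancestor.add((x, y))
                    changed = True
    return ancestor


def magic_set(parent, start):
    """magic-ancestor: the individuals reachable from `start` through Parent."""
    magic, frontier = {start}, [start]
    while frontier:
        x = frontier.pop()
        for (p, z) in parent:
            if p == x and z not in magic:
                magic.add(z)
                frontier.append(z)
    return magic


def magic_forward_chain(parent, start):
    """Forward-chain only over Parent facts guarded by the magic predicate."""
    magic = magic_set(parent, start)
    guarded = {(x, y) for (x, y) in parent if x in magic}
    return forward_chain(guarded)


def answers(facts, who):
    """Project out the answers to the query Ancestor(who, y)."""
    return {y for (x, y) in facts if x == who}


# Hypothetical sample database: two unrelated families.
parent = {("John", "Mary"), ("Mary", "Sue"), ("Bob", "Ann"), ("Ann", "Tim")}
naive = forward_chain(parent)
focused = magic_forward_chain(parent, "John")
```

On this sample database both strategies return the same answers for Ancestor(John, y), but the guarded version never derives any fact about the unrelated family.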

The rewriting essentially precomputes the rule set so that the parts of the search space that do
not contribute to the goal are removed at compile time. Some relations are pruned
as a result. Computational savings are obtained because useless computation
is avoided; also, in a system that has to choose between alternate inference steps,
valuable decision time at the meta-level is saved.

Figure 4.6: Restructuring a part of the Fibonacci computation

2.  Eliminating  repeated  computation 

This is done by reifying that computation: introducing a new object that stands for
the value of that computation and looking up the value as opposed to recomputing it.
There are two cases to analyse here: repeating computation while solving a single
goal, as in the computation of Fibonacci or Factorial, and repeating computation
over several instances of solving for a set of related goals, an instance of which is common
subexpression elimination. The standard formulation of Fibonacci
Fib(0, 1).

Fib(1, 1).

Fib(n - 1, m1) ∧ Fib(n - 2, m2) ∧ m = m1 + m2 => Fib(n, m)

run on a depth-first backward chainer produces the computation trace shown in
Figure 4.6. We can show in the meta-theory of this encoding that, in the computation
of Fib(n, -), the call Fib(m, -) with m < n occurs Fib(n-m) times (except
for Fib(0, -), which occurs Fib(n-2) times). We wish to devise an encoding that
does not perform repeated computation: in effect, an encoding that generates a trace
as in Figure 4.6 when executed by a depth-first backward chainer. The transformation
at the level of proof trees is the conversion of a computation tree into a directed
acyclic graph (DAG) that has common nodes merged. The reformulation problem
is to find the corresponding change in encoding that causes this transformation in
proof space. This situation is depicted in Figure 4.7.

[Figure 4.7: Redesigning formulations by restructuring search spaces. A reformulation
maps Formulation 1 to Formulation 2; correspondingly, the search space of
Formulation 1 is restructured into a new search space.]

Given

•  A problem solver PS that takes an encoding E of a problem and a goal schema
G and generates a proof P for it: PS(E, G) = P.

•  A standard transformation of P to P' that avoids repeated computation.

Find an encoding E' such that PS(E', G) = P'.

A solution to this requires that we be able to correlate changes in proof space with
changes in formulation space. One approach simply maintains a table of proof space
changes and the encoding changes that implement them. For instance, to merge
identical nodes in computation space, we simply create a new object to store that
value. In the Fibonacci case, adding an extra argument to the Fib function achieves
this:

Fib(n, a, b) <= if n = 0 then a else Fib(n - 1, a + b, a)

To find the Fibonacci of n, we use the query Fib(n, 1, 1).

Another approach requires us to have the inverse function PS^-1 that maps proofs
back to encodings. Then the corresponding encoding change can be deduced from the
change in proofs. Unfortunately, this idea cannot be implemented at present
because we lack good descriptions of problem-solvers. When these descriptions become
available, this method will be a general-purpose scheme for deducing new encodings
driven by computational constraints.

The  changes  in  proof  space  can  be  captured  indirectly  by  the  following  irrelevance 
claim  (from  Chapter  3). 
IC3:

∀x,n. Computed(x) => WIA(Compute(x), Factorial(n), T)


This  states  that  if  a  value  has  already  been  computed,  then  the  action  of  computing 
it  is  irrelevant  with  respect  to  solving  a  goal.  This  advises  the  interpreter  to  store 
a  value  as  soon  as  it  is  computed.  For  the  Fibonacci  function,  the  elimination  of 
unnecessary  computation  can  be  achieved  either  by  changing  the  formulation  while 
keeping  the  problem  solver  (or  interpreter)  fixed,  or  by  changing  the  interpreter 
itself  while  keeping  the  encoding  fixed.  This  latter  possibility  is  implemented  by  the 
reduction  method  that  compiles  WIA  claims. 
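The three views of the Fibonacci computation discussed in this item can be compared in a minimal Python sketch. It assumes Fib(0) = Fib(1) = 1 as in the standard formulation; the function names are hypothetical, and `functools.lru_cache` stands in for an interpreter that stores each value as soon as it is computed, in the spirit of IC3. Under this reading of the three-argument rule, the seed (1, 0) reproduces the standard values exactly, while the seed (1, 1) returns the value one position further along the sequence.

```python
from functools import lru_cache


def count_calls(n):
    """Run the standard two-rule formulation and count the calls per argument."""
    calls = {}

    def fib(k):
        calls[k] = calls.get(k, 0) + 1
        if k < 2:                        # Fib(0, 1) and Fib(1, 1)
            return 1
        return fib(k - 1) + fib(k - 2)

    return fib(n), calls


@lru_cache(maxsize=None)
def fib_memo(k):
    """Same encoding, but the interpreter stores every computed value."""
    return 1 if k < 2 else fib_memo(k - 1) + fib_memo(k - 2)


def fib_acc(n, a, b):
    """Reformulated encoding: Fib(n, a, b) <= if n = 0 then a else Fib(n - 1, a + b, a)."""
    return a if n == 0 else fib_acc(n - 1, a + b, a)


value, calls = count_calls(10)
```

For n = 10, the recorded counts agree with the claim in the text: `calls[m]` equals Fib(10 - m) for m from 1 to 10, and `calls[0]` equals Fib(8). The accumulator version makes exactly one call per value of n.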

3.  Eliminate computations that contribute to higher accuracy that we don’t care about.
This is the basis for constructing approximations. For instance, the computations
that result from having the base-emitter capacitor in the hybrid-π model of a
transistor can be eliminated under low frequency conditions, since these computations
only contribute to the higher-order terms in the gain. The two variants of the
computational irrelevance principle introduced above can be weakened by weakening the
definition of logically irrelevant computation: if a computation contributes only to
the higher order bits of accuracy we will call it irrelevant. The elimination of
repeated computation uses an exact match to determine when two computations are
identical. We can relax the criterion to get approximate matches and this would
lead to the elimination of almost equivalent computations.

4.5.4  General  Abstractions 

A taxonomy of irrelevance abstractions can be constructed based on the reduction
framework described in this chapter.

•  Answer  a  subset  of  queries,  the  same  answer  needed. 

For  this  class,  the  irrelevance  claims  are  modulo  the  subset  of  queries  that  need 
to  be  preserved.  Reduction  by  these  claims  leads  to  the  generation  of  new  defined 
relations.  The  resulting  formulation  is  a  specialization  of  the  given  formulation  tuned 
to  answering  those  queries  really  efficiently.  The  kinship  examples  in  this  chapter 
are  examples  of  this. 
•  Same queries, abstract answer suffices

The abstract answers define an equivalence class of solutions. Irrelevance claims
propagate equivalence on solution space into the formulation. When a certain
property is deemed irrelevant, all objects distinguishable only by that property become
indistinguishable. Abstraction to remove properties has the side-effect of creating
equivalence classes of objects. Examples are the missionaries and cannibals problem
as well as Abstrips.

•  Same  queries  need  to  be  solved  at  the  same  level  of  detail 

Find  wasted  computation  and  state  it  as  irrelevance  claims.  Reducing  irrelevance 
usually  involves  introducing  a  new  object  or  relation  to  store  computation  that  was 
getting repeated before. Examples are Fibonacci, factorial, macrops and compiler
optimisations.


Chapter  5 


Techniques  for  Automating 
Irrelevance  Analyses 

This  chapter  proposes  reformulation  techniques  that  are  specializations  of  the  meta-level 
irrelevance  minimization  described  in  the  previous  chapters.  The  two  processes  that  need 
to  be  compiled  to  provide  effective  automation  of  reformulation  are 

1.  Discovery  of  irrelevance  claims. 

2.  Reduction  of  formulations  by  the  irrelevance  claims. 

We analyze various types of irrelevance reformulations and describe the special
properties of the irrelevance claims that allow for the development of tractable algorithms for
the discovery and reduction of irrelevance.

5.1  Intermediate  variable  elimination 

The key idea explored here is optimising computations by eliminating intermediate
computation. We recognize intermediate computation and determine if it is irrelevant. Such
determinations are stated as computational irrelevance claims in the meta-theory. We then rewrite the
computation so that only endpoints are preserved. This is implemented by reduction
inferences that short-circuit intermediate computation.

A simple example where reasoning chains can be compressed without violating
correctness constraints is the following.

Example 9 The set T1 = {p(x, y) ∧ q(y, z) ∧ r(z, w) => s(w)} can be reformulated as
follows by introducing a new intermediate relation that eliminates the thread through the
variable y.

T2 = {t(x, z) ∧ r(z, w) => s(w)} where the articulation theory A between T1 and T2 is
{p(x, y) ∧ q(y, z) => t(x, z)}.

The correctness proof for this reformulation is trivial. In the first place, T1, A ⊨ T2
and T2, A ⊨ T1. Also, it is the case that ∀w. T1 ⊨ s(w) ≡ T2 ⊨ s(w) for the same
extensional database (same extensions for given predicates p, q and r). The difference
in the two programs surfaces in the efficiency proof. In the proof space, we reduce the
branching factor on the s(w) node by 1 by introducing the pre-computation of the join of p
and q. Essentially, we prevent recomputation of p(x, y) ∧ q(y, z) over various instantiations
of the goal s(w). y is a thread variable that does not occur in the final query. The role
of z is also to act as an intermediary in the computation of s(w). One idea is to deem
all intermediates irrelevant: this will have the effect of minimizing intermediate variable
threading, by precomputing joins of relations and materializing those joins. If only the
endpoints matter, then store them and throw the intermediate results away. This is the same
idea behind chunking and macrops.
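Example 9 can be made concrete with a minimal Python sketch, assuming the extensional database is given as sets of pairs. The function names and the sample relations are hypothetical; the sketch only checks that precomputing the join t of p and q leaves the answers for s unchanged.

```python
def answers_t1(p, q, r):
    """Evaluate T1 directly: p(x, y) ∧ q(y, z) ∧ r(z, w) => s(w)."""
    return {w
            for (x, y) in p
            for (y2, z) in q if y2 == y
            for (z2, w) in r if z2 == z}


def materialize_t(p, q):
    """Articulation theory A: p(x, y) ∧ q(y, z) => t(x, z); y is threaded away."""
    return {(x, z) for (x, y) in p for (y2, z) in q if y2 == y}


def answers_t2(t, r):
    """Evaluate T2: t(x, z) ∧ r(z, w) => s(w), using the stored join."""
    return {w for (x, z) in t for (z2, w) in r if z2 == z}


# Hypothetical extensions for the given predicates.
p = {(1, "a"), (2, "a"), (3, "b")}
q = {("a", 10), ("b", 20), ("c", 30)}
r = {(10, "u"), (20, "v"), (30, "w")}

t = materialize_t(p, q)          # computed once, reused for every s(w) goal
```

Here `t` holds the three pairs (1, 10), (2, 10) and (3, 20), and both encodings derive s(u) and s(v). The saving comes from reusing `t` across goal instantiations rather than rejoining p and q each time.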

The order of thread variable elimination determines the size of the intermediate
relations introduced. The thread variables that we eliminate depend on the queries to be
preserved, the sizes of the relations, and the sizes of the intermediate relations created. In the
simplest cases, the thread variables are immediately apparent in the encoding. Almost
all of the dynamic programming examples have this property. However, recognition of
irrelevant intermediate computation is far from easy. The recognition that the threading
through Ancestor could be eliminated in the kinship example is a non-trivial one.

We now perform an analysis using irrelevance for thread variable elimination. There
are two possible analyses.

Analysis 1

∀x. x ∈ T => WIA(Compute(x), g, T)

This captures the fact that if a fact x is in the database, the action of computing it is
irrelevant. It assumes that x is an eternal fact (does not change with time). Compute(x)
reifies the search tree for x rooted at x. The search tree for x is expanded using the rules
in T.
The semantics of this irrelevance claim is: even if the action Compute(x) is not
performed, g will be solvable in T as long as x is in it. It prescribes that the action
Compute(x) not be performed. Instead, to obtain the value of x, a lookup in the database
will be performed.

Analysis  2 

The  irrelevance  claim  for  Example  9  is 

t(x, z) ∈ T => CI(p(x, y), g(m, n), T) ∧ CI(q(y, z), g(m, n), T)

A gross step in the computation (from x to z) renders its component steps
computationally irrelevant. In fact, the irrelevance claim in Analysis 1 is a special case of this, since
the two options, looking up a value and computing it, are compared, and the gross step
of looking up a stored value (bypassing the computation) renders the steps that compute
the value irrelevant.

The reduction method for this class of abstractions simply adds the new relation on
the LHS of the irrelevance claim and removes the component relations from which it is
derived.

The  complexity  of  doing  thread  variable  elimination  in  cases  where  the  threads  are 
recursive  is  best  illustrated  by  the  kinship  example.  Here  are  some  comparisons  between 
the  kinship  example  and  the  p-q  example.  The  main  similarities  are  that  in  both  cases 

•  a new relation is introduced that is definable entirely in terms of the existing
relations.

•  a  certain  class  of  queries  is  preserved  across  the  transformation. 

•  the  set  of  objects  is  not  changed. 

•  the  new  relations  FoundingFather  and  t  are  good  islands  of  pre-computation. 

The  main  differences  are 

•  the new relation FoundingFather is a subset of the given Ancestor relation. So a
problem solver that simply caches Ancestor facts would ultimately obtain the links
that make up the new relation. t is not a subset of an existing relation. The effect
of introducing the relation symbol t cannot be duplicated by caching ground atomic
literals.

•  t is a simple macro-operator, whereas FoundingFather is not.

•  introducing the FoundingFather relation symbol eliminates redundancy in the space of
proofs (now every SameFamily fact has exactly one proof), whereas the introduction
of t does not.

The  method  of  eliminating  thread  variables  can  be  applied  to  any  state-space  problem 
formulated  as  follows. 

Legal(s1, s) ∧ Reachable(s, s2) => Reachable(s1, s2)

This states that state s2 is reachable from s1 if there is a legal move from s1 to an
intermediate state s from where s2 is reachable. Now we need to minimize the Reachable
relation while preserving paths of interest to prevent unnecessary intermediate
computation.

5.1.1  Relations  to  previous  work 

Comparison  to  partial  evaluation:  the  general  techniques  of  partial  evaluation  involve 
propagation  of  constants  and  unfolding  calls  by  their  definitions.  Generally,  they  do  not 
make  short  cuts  in  the  computation  sequence  or  change  program  control  to  eliminate 
redundant  computation:  both  of  these  are  done  by  the  methods  to  automate  irrelevance 
reformulations. 

Comparison with macrops: justification for short-circuiting steps in search space or proof
space is implicit in the creation of macrops. The method of irrelevance minimization first
introduces redundant macro steps and then eliminates the constituent micro steps. The
macrops question is: given a state space, what are the best short circuits to construct
for solving a given class of goals efficiently? The abstraction problem with irrelevance
minimization is: what steps need to be added and what can be thrown away in the
service of a given class of goals? Note that a system using macrops needs to learn control
knowledge to make the problem solver choose appropriate short circuits. The irrelevance
minimization method analytically computes this knowledge and compiles it into the base
theory.


5.2  Irrelevance  analysis  of  abstraction  in  circuits 


The aim is to do an irrelevance analysis of some classic abstractions in digital and analog
circuits. As an example from analog circuits, we use the theory of irrelevance to derive the
concept of Thevenin equivalents. This analysis helps to clarify the computational utility
of forming Thevenin equivalents of circuits in terms of the search space transformation
it entails. As an example from digital circuits, we construct the structural abstraction
of a digital device [Sin86]. This reformulation effectively hides detail about intermediate
computation in the circuit.

Figure 5.2: Proof of the goal I_AC

5.2.1  Thevenin  Equivalents 

In the simple resistive network of Figure 5.1, the goal is to compute the current I_AC
through the voltage source V. The standard method of solution uses Ohm’s and Kirchhoff’s
laws. The proof tree that solves this goal is in Figure 5.2.

To meet our computational constraint of solving for I_AC in constant time with a
constant overhead in space, we decide to shorten proofs of I_AC. Terminating the proof at
height 1 satisfies the constraint on time. To make sure that all future instances of proofs
of I_AC are 1 step long, we propagate the proof tree transformation into the formulation
by replacing the R1, R2 series combination of resistors by the equivalent R_AC. The
introduction of this new object achieves the effect of reducing the height of the proof for this
and future instances of proofs of I_AC on this circuit. This transformation is equivalent to
replacing an entire subtree in the computation by a single node and reifying the value of
the node in a new object.

Figure 5.3: The Thevenin equivalent reformulation

We can give an account of the introduction of a new object in terms of reduction of
irrelevance. The value of the voltage at the point B is irrelevant to the computation of the
current through the circuit. So we can weaken the initial formulation to make the value
of the voltage at B underivable without affecting the value of I_AC. The irrelevance claim
is:

WI(V_AB, I_AC, T) ∧ WI(V_BC, I_AC, T)

Reducing the formulation to make these claims underivable in the meta-theory of
the new formulation requires introducing the compound object R_AC = R_AB + R_BC and
rewriting the formulation as in Figure 5.3.

The  form  of  the  reduction  inference  is: 

Given:

•  the detailed circuit description

•  Ohm’s and Kirchhoff’s laws

•  a description of the goal schema

•  the irrelevance claims with respect to the goal schema

Find a new abstract description that does not compute the values specified as irrelevant
by the given claims.
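The effect of this reduction inference on the series circuit can be sketched in Python. The sketch assumes that the network of Figure 5.1 is a voltage source V across A and C with R_AB and R_BC in series, which is an assumption since the figure is not reproduced here; the function names and component values are hypothetical.

```python
def current_detailed(v, r_ab, r_bc):
    """Solve for I_AC by way of the intermediate voltages V_BC and V_AB."""
    v_bc = v * r_bc / (r_ab + r_bc)   # voltage across R_BC (series divider)
    v_ab = v - v_bc                   # Kirchhoff's voltage law around the loop
    return v_ab / r_ab                # Ohm's law applied to R_AB


def reformulate(r_ab, r_bc):
    """Introduce the compound object R_AC = R_AB + R_BC once, ahead of time."""
    return r_ab + r_bc


def current_abstract(v, r_ac):
    """One-step proof: Ohm's law on the equivalent resistor; V_AB and V_BC never appear."""
    return v / r_ac


r_ac = reformulate(2.0, 3.0)          # hypothetical values in ohms
i_detailed = current_detailed(10.0, 2.0, 3.0)
i_abstract = current_abstract(10.0, r_ac)
```

Both routes give the same current, but the abstract description has no way to derive the voltage at B, which is what the irrelevance claims ask for.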

5.2.2  Full  Adders 

Consider a digital device specified as follows: IMP(a, x, y, g) ≡ P1(a, x, y, x) ∧ P2(y, y, g).
To solve for values of g given values of a, we have to compute the internal values x and
y. However, as far as the output value g is concerned, the value at the intermediate
points is irrelevant and can be abstracted. This requires weakening the theory of the
device by introducing a new predicate that existentially quantifies the internal signals:
New-Pred(a, g) ≡ ∃x,y. IMP(a, x, y, g). Notice that the new predicate introduced hides
information about the internal structure of the device and preserves its i-o behaviour.
Singh [Sin86] calls these abstractions structural abstractions. At the level of proof trees,
this amounts to shortening the proofs to height 0 by essentially precomputing the i-o
behaviour as a table in the relation New-Pred.

In the case of the full adder, shown in Figure 2.1, we can derive the structural
abstraction that eliminates details of the actual connections between the various gates by
declaring the values at the internal points, viz., d, e and f, irrelevant. To reduce the
theory with respect to these irrelevance claims, we need to weaken it to make the values
at these points underivable in the new theory. Introducing the new predicate
Full-Adder(a, b, c, sum, carry) ≡ sum = xor(a, b, c) ∧ carry = ab + bc + ca and removing
the original description of the full adder makes the values at d, e and f no longer deducible.
There are two major steps in the abstraction: identifying which points in the circuit to
treat as internal points (since sum and carry are the outputs we are interested in, and
a, b and c are the given inputs, all other nodes in this circuit become internal nodes by
elimination), and deriving the expressions for sum and carry purely in terms of the
givens, which is an algebraic rewrite problem.
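The structural abstraction of the full adder can be checked exhaustively with a short Python sketch. The gate-level wiring below (d = a xor b, e = a and b, f = d and c) is an assumption, since Figure 2.1 is not reproduced here; it is the usual two-half-adder construction. The function names are hypothetical.

```python
def full_adder_gates(a, b, c):
    """Detailed description: exposes the internal points d, e and f."""
    d = a ^ b            # first XOR gate
    e = a & b            # first AND gate
    f = d & c            # second AND gate
    return {"d": d, "e": e, "f": f, "sum": d ^ c, "carry": e | f}


def full_adder_abstract(a, b, c):
    """Abstract description: sum = xor(a, b, c), carry = ab + bc + ca."""
    return {"sum": a ^ b ^ c, "carry": (a & b) | (b & c) | (c & a)}


# The two descriptions agree on the outputs for all eight input combinations.
agree = all(
    full_adder_gates(a, b, c)["sum"] == full_adder_abstract(a, b, c)["sum"]
    and full_adder_gates(a, b, c)["carry"] == full_adder_abstract(a, b, c)["carry"]
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
)
```

The abstract description returns only `sum` and `carry`, so the values at d, e and f can no longer be read off from it, while the i-o behaviour is preserved.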
5.3 The Irrelevance theorem prover

5.3.1 The propositional version

We now present an implementation of a propositional logic of irrelevance that allows us to
calculate irrelevancies in time linear in the size of the dependency network. The basic idea
is the following: have a dependency network at a stable configuration with truth values
attached such that each node has values that make it true. We then perturb the network
by changing some value from true to false, or false to true. Then we propagate the effects
of this through the network with local propagation rules that calculate the partial discrete
derivatives of these propositions.

5.3.2  The  Implementation  of  irrelevance  detection  by  hardware 

The basic idea is to convert a set of sentences into a combinatorial circuit so that it can answer
questions of the form: is f irrelevant to g? This is done in the following manner. We perturb the value
of f in the circuit (if f is an internal point, we have to modify the input appropriately to
achieve the change at f) and propagate the consequences down to g.

We can do this only if the theory is finite, there are no cyclic dependencies in the
circuit, and there are no recursive rules.

The  propagation  rules  are  (f,h  are  inputs  to  a  gate,  g  is  the  output) 

1.  AND: if h = 1 then f(0→1) => g(0→1)

2.  AND: if h = 1 then f(1→0) => g(1→0)

3.  OR : if h = 0 then f(0→1) => g(0→1)

4.  OR : if h = 0 then f(1→0) => g(1→0)

5.  NOT: f(1→0) => g(0→1)

6.  NOT: f(0→1) => g(1→0)


More  complicated  f’s  can  be  reduced  to  atomic  ones  and  the  above  propagation  methods 
applied. 
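The propagation rules can be sketched as a single pass over the gates of an acyclic circuit. This is a minimal Python sketch under several assumptions: gates are listed in topological order as (output, kind, inputs) triples, the perturbed wire is flipped directly rather than through the inputs, and a gate whose two inputs both change (a case the six rules do not cover) is handled by comparing its old and new output. All names and the sample circuit are hypothetical.

```python
def gate_value(kind, bits):
    if kind == "AND":
        return bits[0] & bits[1]
    if kind == "OR":
        return bits[0] | bits[1]
    if kind == "NOT":
        return 1 - bits[0]
    raise ValueError(kind)


def evaluate(gates, inputs):
    """Compute the stable configuration of the circuit for the given inputs."""
    values = dict(inputs)
    for out, kind, ins in gates:
        values[out] = gate_value(kind, [values[w] for w in ins])
    return values


def output_flips(kind, ins, old, changed):
    """Decide locally whether a gate's output changes, given which inputs changed."""
    moved = [w for w in ins if w in changed]
    if not moved:
        return False
    if kind == "NOT":
        return True                                   # rules 5 and 6
    if len(moved) == 1:
        h = ins[1] if ins[0] == moved[0] else ins[0]  # the unchanged side input
        return old[h] == (1 if kind == "AND" else 0)  # rules 1-2 (AND), 3-4 (OR)
    # Both inputs changed: outside the six rules, so compare old and new output.
    before = gate_value(kind, [old[w] for w in ins])
    after = gate_value(kind, [1 - old[w] for w in ins])
    return before != after


def changed_wires(gates, old, f):
    """One pass over the gates: linear in the size of the network."""
    changed = {f}
    for out, kind, ins in gates:
        if output_flips(kind, ins, old, changed):
            changed.add(out)
    return changed


def irrelevant(gates, old, f, g):
    """f is irrelevant to g in this configuration if perturbing f never reaches g."""
    return g not in changed_wires(gates, old, f)


# Hypothetical circuit: g = (a AND b) OR c.
gates = [("n1", "AND", ["a", "b"]), ("g", "OR", ["n1", "c"])]
stable = evaluate(gates, {"a": 1, "b": 0, "c": 0})
```

In the configuration a = 1, b = 0, c = 0, perturbing a is blocked at the AND gate because b = 0, so a is irrelevant to g there, whereas perturbing b propagates through both gates.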
5.4 The irrelevance reduction system

5.4.1  Extensional  Reduction 

The  extensional  reduction  algorithm  takes  the  following  inputs: 

1.  A  database  of  facts  and  rules  that  describes  a  problem 

2.  The  goal 

3.  The  irrelevance  claims  for  that  goal 

and  produces  a  new  database  of  facts  and  rules  that  preserves  answers  to  the  goal.  The 
irrelevance  claims  are  no  longer  true  of  the  new  database. 

The  algorithm  is: 

Algorithm Reduce

Inputs:  Theory T to be reformulated
         Irrelevance claims of the form Condition => CI(f,g,T)
Output:  New theory where irrelevance claims are no longer trivially true

Temp-Theory = Emptyset.
T = Input Theory                ; initialization
Repeat until T and Temp-Theory are the same
    Temp-Theory = T.
    Find an irrelevance claim where f unifies with a fact in T.
    If it is a conditional claim then
        Find augmentation action to make condition true.
        Perform augmentation action.
    Find revision action to make f underivable.
    Perform revision action.
End Repeat

Augmentation actions indexed by the form of the condition to be achieved are coded as
condition-action rules. Revision lemmas that prescribe particular delete actions are the
assertional component of the irrelevance calculus. We illustrate one step of reduction by IC1
on a database. Suppose T = {Father(A, B), Father(...), Father(x, y) => Ancestor(x, y)}.

IC1: ∀x,y,m,n. Ancestor(x, y) ∈ T => CI(Father(x, y), SameFamily(m, n), T).

Name       : IC1 (&x &y)
Condition  : Father(&x &y) in T
Addlist    : Ancestor(&x &y)
Deletelist : Father(&x &y)

Figure 5.4: The STRIPS operator compiled from Irrelevance Claim 1

If we follow the steps of the algorithm above after the appropriate initializations:

1.  We select IC1 with bindings {x = A, y = B}.

2.  We now have the goal of augmenting the theory to make Ancestor(A, B) ∈ T true.
We use the fact that an ∈ condition can be made true by adding the fact named in the
condition to the theory. We then perform the action of adding Ancestor(A, B) to T.

3.  Now the goal is to revise the augmented theory to make it not entail Father(A, B).
We use the fact that Father is a primitive relation (it is a source node in the
definability map) and the fact that revising a theory to make a primitive fact underivable
requires simply removing it. We then perform the action of removing Father(A, B).

Note that the reasoning steps required to establish the augmentation and revision methods
are repeated for every instantiation of Father(x, y) in the database. We compile out this
reasoning by performing it once and for all at the start of the reduction process. The
result is a STRIPS operator that achieves the same effect as theorem proving on the CI
claim: the addlist contains the result of deliberation about the augmentation action and
the deletelist the resulting revision action. The STRIPS operator compiled out of IC1 is
shown in Figure 5.4.
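The loop of Algorithm Reduce, specialized to the compiled operator of Figure 5.4, can be sketched in Python. This is a minimal sketch under stated assumptions: the theory is reduced to a set of ground facts written as tuples (the rule Father(x, y) => Ancestor(x, y) is left out), the second Father fact in the sample database is hypothetical, and the names `apply_ic1` and `reduce_theory` are invented for illustration.

```python
def apply_ic1(facts):
    """Apply the compiled operator once: add Ancestor(x, y), delete Father(x, y)."""
    for fact in sorted(facts):
        if fact[0] == "Father":                 # Condition: Father(&x &y) in T
            _, x, y = fact
            return (facts - {fact}) | {("Ancestor", x, y)}
    return facts                                # no instantiation left


def reduce_theory(facts):
    """Repeat until T and Temp-Theory are the same."""
    temp = None
    while temp != facts:
        temp = facts
        facts = apply_ic1(facts)
    return facts


# Hypothetical database; only Father(A, B) is taken from the text.
theory = {("Father", "A", "B"), ("Father", "B", "C")}
reduced = reduce_theory(theory)
```

After reduction no Father fact remains in the database, and each one has been replaced by the corresponding Ancestor fact, so no per-fact theorem proving is needed at run time.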

5.5  Search  Control  Issues 

We need abstract representations of encodings to provide good search control for
instantiating CI and WI lemmas for the verification and generation of irrelevance claims. We
introduce a structure called a definedness graph that succinctly captures the definability
relationships between the conceptual primitives in a particular encoding.

Figure 5.5: The definedness graph for encoding E1 of the kinship problem

5.5.1 Definedness Graphs

Definition 25 A definedness graph (V, E) of an encoding E_C of a conceptualization C
= (O, F, R) is a directed AND-OR graph with vertex set V = {O, F, R}, where O, F and R
are elements denoting 𝒪, ℱ, ℛ respectively, and the edge set E = {(v_i, {v_j1, ..., v_jk}) |
v_i ∈ V is defined in terms of the elements v_j1, ..., v_jk in V}.

The definedness graph for the encoding E1 of the kinship problem is shown in
Figure 5.5. SameFamily is defined in terms of Ancestor. Father is a primitive relation symbol
since it is a sink node in the graph. The definedness graph makes it clear that the only role
of Father in this encoding is to define Ancestor. Ancestor itself is grounded in Father. The
recursive part of the definition of Ancestor is captured by the (Ancestor, {Ancestor, Father})
edge. Notice the AND-arc that captures this graphically in Figure 5.5.

Notice that the definability relationship is established in a particular encoding. A
definedness graph hides details of the actual definition; it only preserves the fact that a
particular element in an encoding can be defined in terms of others. If, for instance, the
Ancestor relationship had been defined as

Ancestor(x, y) <= Father(x, z) ∧ Ancestor(z, y)

we would still have the same definedness graph as above. Thus these graphs are really
abstractions of particular encodings.

The graph-theoretic interpretation of definability in terms of definedness graphs allows
us to construct them given the encodings. The definability checks needed to determine if
one conceptualization makes fewer distinctions than another can be translated to
path-finding problems on these graphs. We provide algorithms and complexity results for these
operations.

Figure 5.6: The definability map

Theorem  17  The  definability  graph  for  Horn  clause  encodings  can  be  constructed  in  time 
linear  in  the  length  of  the  encoding. 

Proof: We give a constructive proof. For every sentence in the encoding of the form
h <= b_1, b_2, ..., b_n, we add to E the edge (h, B) where B = ∪_{i=1..n} {b_i}, if it doesn’t exist there.
We add h and the b_i's to the vertex set V. This construction looks at every sentence only
once and is thus linear in the size of the encoding. □
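The construction in the proof can be sketched in a few lines of Python. It assumes each Horn clause has already been abstracted to its head predicate and the list of predicates in its body; the function name and the clause bodies used for the kinship encoding are hypothetical, chosen to match the edges described in the text.

```python
def definedness_graph(clauses):
    """Build (V, E) in one pass; each clause is (head, [body predicates])."""
    vertices, edges = set(), set()
    for head, body in clauses:
        vertices.add(head)
        vertices.update(body)
        edges.add((head, frozenset(body)))   # a set, so duplicate edges collapse
    return vertices, edges


# Assumed predicate-level view of the kinship encoding E1.
kinship = [
    ("Ancestor", ["Father"]),                # Father(x, y) => Ancestor(x, y)
    ("Ancestor", ["Father", "Ancestor"]),    # recursive rule
    ("SameFamily", ["Ancestor", "Ancestor"]),
]
vertices, edges = definedness_graph(kinship)
```

Because the body is stored as a set, reordering the recursive rule's body as (Ancestor, Father) or using a different variable pattern yields the same edge, which reflects the remark that the graph abstracts away the details of the definition.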

Suppose we have the encodings of two conceptualizations C1 and C2 as well as their
articulation theory A represented as definedness graphs. Figure 5.6 shows an example.
To determine whether an element c in C1 is definable in terms of the elements in C2, we
check whether there is a path from elements in C2 to c in the definability map.

To  present  the  path-finding  algorithm,  we  need  to  distinguish  three  types  of  nodes  in 
the  definedness  graph 

1.  Selfloop-Nodes: v_i ∈ V that have an edge (v_i, B) where v_i ∈ B.

2.  And-Nodes: v_i ∈ V that have an edge (v_i, B) where B is a non-empty, non-singleton
set.

3.  Singleton-Nodes: v_i ∈ V that have an edge (v_i, B) where B is a singleton set.

These  are  not  mutually  exclusive  categories. 
Theorem 18 When the only cycles in a definedness graph G are those due to
selfloop-nodes, the complexity of determining whether a node c in G is defined in terms of another
set of nodes N in G is polynomial in the number of edges.

Proof: We provide a constructive proof. We start with the set In-Set that initially
contains the elements in N. We add a node v_i to this set, if it is not already there, when
there is an edge (v_i, B) in E where B ⊆ In-Set. For an And-node this means that all the
nodes that define it have to be present in In-Set. We can add a selfloop-node only if there
is an edge (v_i, B) in E where v_i ∉ B. This requirement guarantees that all recursive
definitions are grounded. The addition of nodes to In-Set stops either when c is added to In-Set,
in which case we declare that c is defined in terms of N in G, or when no more additions can
be made, in which case c is not defined in terms of N in G. When G has no
selfloop-nodes, this algorithm looks at every edge exactly once and is thus linear in the number
of edges. If there are m selfloop-nodes then the algorithm makes m passes over the edge
list in trying to satisfy the groundedness condition. When there are non-trivial cycles in
the definedness graph (these correspond to mutually recursive definitions), this algorithm
becomes exponential in the number of edges in G. □

To see how this algorithm works, consider the problem of determining whether
FoundingFather can be defined in terms of Father in Figure 5.6. We start with In-Set initialized
to {Father}. Since we have the edge (Ancestor, {Father}) in E, and Father is present in
In-Set, we add Ancestor to In-Set. Since (Ancestor, {Ancestor}) is in E, and the
groundedness condition for Ancestor is satisfied by the previous edge, we can add Ancestor to
In-Set but it is already there! Using the edge (FoundingFather, {Ancestor}) in E, we can
add FoundingFather to In-Set since Ancestor is already in it. Now the algorithm
terminates successfully, reporting that FoundingFather has a definition in terms of Father in the
encoding represented by the definedness graph.
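The construction in the proof of Theorem 18 can be read as a fixpoint computation. The
following is a minimal Python sketch under that reading, not the MRS implementation. The
function name `is_defined`, the edge-list encoding, and the `EDGES` table (only the three
edges named in the walkthrough above, not the whole of Figure 5.6) are assumptions made
for illustration.

```python
def is_defined(edges, target, basis):
    """Return True if `target` is definable from `basis` in the definedness graph.

    `edges` is a list of (node, body) pairs; each pair says that `node` can be
    defined in terms of the set of nodes `body`.
    """
    in_set = set(basis)
    changed = True
    while changed and target not in in_set:
        changed = False
        for node, body in edges:
            # An edge fires only when its whole body is already in In-Set.
            # A self-loop edge (node in body) therefore can never be the edge
            # that first adds `node`: this is the groundedness requirement.
            if node not in in_set and set(body) <= in_set:
                in_set.add(node)
                changed = True
    return target in in_set


# Hypothetical encoding of the three edges mentioned in the text.
EDGES = [
    ("Ancestor", {"Father"}),
    ("Ancestor", {"Ancestor"}),
    ("FoundingFather", {"Ancestor"}),
]

if __name__ == "__main__":
    print(is_defined(EDGES, "FoundingFather", {"Father"}))
    print(is_defined(EDGES, "Father", {"FoundingFather"}))
```

Calling `is_defined(EDGES, "FoundingFather", {"Father"})` follows the same steps as the
walkthrough: Ancestor enters In-Set through the edge grounded in Father, and
FoundingFather follows.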

The definability map allows us to analyze the conceptualization intensionally. One way
to think about the elements of a conceptualization is as the elements of a basis set. Using
the construction in the theorem above, we can determine if there is redundancy in the
conceptualization itself. If we can define parts of a conceptualization from other parts of it,
we will call that conceptualization redundant. Efficiency issues can be discussed by cutting
the conceptualization by a p-d cut. The items on the p side of the cut will be extensionally
stored in the encoding. The items on the d side will be derived from the extensionally stored
items. Note that the definability analysis is meta to the conceptualization itself. We can
determine a range of encoding shifts by simply defining the possible cuts of the definability
map. Setting this up as a combinatorial optimisation problem is an interesting problem
for future research.

Figure 5.7: The Architecture of a Reformulator
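The p-d cuts described above can be enumerated by brute force for a small definability
map. The sketch below is only an illustration under assumed names (`closure`, `pd_cuts`,
and the three-edge `EDGES` table are hypothetical); it lists every cut whose d side is
derivable from its p side and makes no attempt at the optimisation left for future
research. Its cost is exponential in the number of elements.

```python
from itertools import combinations


def closure(edges, stored):
    """Return every element derivable from the extensionally stored ones."""
    derived = set(stored)
    changed = True
    while changed:
        changed = False
        for node, body in edges:
            if node not in derived and set(body) <= derived:
                derived.add(node)
                changed = True
    return derived


def pd_cuts(elements, edges):
    """Yield (p_side, d_side) pairs, smallest p side first.

    A cut is kept only if every item on the d side can be derived from the
    items stored on the p side.
    """
    ordered = sorted(elements)
    everything = set(ordered)
    for size in range(len(ordered) + 1):
        for p_side in combinations(ordered, size):
            if closure(edges, p_side) == everything:
                yield set(p_side), everything - set(p_side)


# Hypothetical definability map for the kinship example.
ELEMENTS = {"Father", "Ancestor", "FoundingFather"}
EDGES = [
    ("Ancestor", {"Father"}),
    ("Ancestor", {"Ancestor"}),
    ("FoundingFather", {"Ancestor"}),
]

if __name__ == "__main__":
    for p_side, d_side in pd_cuts(ELEMENTS, EDGES):
        print(sorted(p_side), "|", sorted(d_side))
```

In this small map every admissible cut stores Father, and the smallest cut stores Father
alone and derives the other two relations.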


5.6  The  Architecture  of  a  Reformulator 

The discovery and reduction components are put together as indicated in Figure 5.7.
Only the empirical method of discovery has been programmed: currently the discovery
module takes the given formulation and the correctness and goodness constraints as input
and analyzes the computation to generate irrelevance claims that are correct (so the
goal will be preserved in the reformulation) and good (so the computation of the goal
respects the computational goodness constraints). The reduction process takes the
meta-theoretical irrelevance claims and either performs extensional reduction to generate the
new formulation directly, or intensional reduction to produce the necessary definitions of
new predicates and objects. A rewrite system then generates the new formulation in terms
of these definitions. At present, the reduction component has been implemented in the
meta-level programming system MRS [Gen83b]. It has been tested on the Founding Father
example and a few variants of it. The Verifier has successfully verified the irrelevance
claims presented in this thesis. Extending the capabilities of the verifier requires adding
powerful lemmas to the irrelevance calculus. The discovery component is implemented as
a set of demons that look for regularities in the symbolic computation trace. Extending
the discovery component requires programming in more general-purpose regularity detectors
in computational traces. The two proof reconfiguration methods have been used to
generate Thevenin equivalents of analog circuits.


Chapter  6 


Conclusions  and  Future  Work 


6.1  Summary  of  Contributions 

Research in reformulation is purely exploratory in nature because our knowledge is too
poor to even formulate the appropriate questions, let alone solve them. The main
contribution of this thesis is a conceptual framework in which relevant questions about
reformulations for computational efficiency can be phrased and answered. The framework
attempts to combine a way of looking at things that has powerful heuristic value with a
collection of mathematical definitions and theorems that can be used for rigorous derivation
of results. A Type 1 theory that captures regularities in the space of reformulations
has been uncovered. The theory can analyze reformulations by justifying them, and in
some cases generate them. The chief result from the analysis is the development of a 2-step
meta-theoretic method for the automation of abstraction reformulations for computational
efficiency.

1.  Generate  irrelevance  claims  that  are  true  of  the  given  formulation  with  respect  to 
the  given  class  of  goals. 

2.  Reduce  the  formulation  by  the  irrelevance  principle. 

Two general methods for the generation of irrelevance claims have been investigated:
deriving them in the meta-theory by backward-chaining on the lemmas in the calculus of
irrelevance, as well as by empirical analysis of proof and search spaces. Reduction methods
have been developed that generate a large class of reformulations: the generation
of Thevenin equivalents, the transformation of the Fibonacci computation to tail recursion,
several compiler optimizations, encoding variations of the FoundingFather example,
macro-operators and macro-objects.

Effective mechanization of this method has been accomplished on the class of abstraction
reformulations called elimination of intermediates. The irrelevance claims for this
class have a very special form: a gross step in the computation renders an intermediate
step irrelevant. The meta-theoretic irrelevance lemmas of the irrelevance calculus allow
us to generate these claims in a goal-directed fashion. The reductions also have a specific
form: they either involve wholesale removal of relations, or the introduction of new
relations and objects that are definable in terms of the existing ones.

The theory has been empirically verified by constructing a prototype of a first-principles
reformulator whose architecture is shown in Figure 5.7. It has been tried on the kinship
example and its variants, as well as the generation of Thevenin equivalents of analog
circuits. An interesting empirical fact was that the irrelevance claims used to discover the
FoundingFather relation have the same form as the ones used to derive the concept of a
Thevenin equivalent of a circuit from Kirchhoff’s and Ohm’s laws. Historically, the
concept of Thevenin equivalents was discovered almost 100 years after Kirchhoff’s laws became
known. Our reformulator derived the notion of Thevenin equivalents using the rather
limited set of discovery and reduction methods at its disposal.

This shows that the framework explored in this thesis has the potential to unify
disparate abstraction phenomena as instances of a powerful invariant in granularity shifts:
the most economical description is the one that uses concepts of the largest granularity
consistent with the correctness constraints. The ability to generate reformulations from
such basic considerations can have significant impact not only on AI, but also in improving
our current stock of scientific concepts and formulations.

6.2  Evaluation 

6.2.1  The  nature  of  irrelevance  reformulations 

Reformulations involve modifying both conceptualizations (or models) and encodings. The
actual end product of a reformulation is an encoding (or theory) with better computational
properties. The semantics of the theory revision is explained in terms of the effects on
its models. However, not all encoding shifts can be explained in terms of models. If, for
instance, we replace the expression $a \wedge b$ by the expression $b \wedge a$, this replacement might
have computational impact ($a$ might be false most of the time and might be cheaper to
compute); however, it does not change the models of the theory. Conjunct orderings of this
form are an instance of theory revision that is not visible at the level of models. This is not
surprising, because models only characterize truth without considering the cost involved in
computing truth values. In sum, reformulations at the encoding level are a description of
the phenomenon at a fairly fine-grained level. The definition of reformulation at the level
of conceptualizations is a coarser-grained description. The irrelevance principle states that
minimizing distinctions subject to the correctness constraints leads to the construction of
more efficient theories.
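The conjunct-ordering point can be made concrete with a small sketch. It assumes a
short-circuit evaluator and invented cost units (the names `run`, `TRUTH`, and `COSTS`
are hypothetical, and the numbers are not measurements): both orderings return the same
truth value, so the models are unchanged, yet the work done differs.

```python
def make_conjunct(name, value, cost, log):
    """Build a conjunct that records its (hypothetical) cost when evaluated."""
    def evaluate():
        log.append((name, cost))
        return value
    return evaluate


def evaluate_conjunction(conjuncts):
    """Evaluate left to right, stopping at the first false conjunct."""
    for conjunct in conjuncts:
        if not conjunct():
            return False
    return True


def run(order, truth, costs):
    """Return (truth value, total cost) of the conjunction in the given order."""
    log = []
    conjuncts = [make_conjunct(n, truth[n], costs[n], log) for n in order]
    result = evaluate_conjunction(conjuncts)
    return result, sum(cost for _, cost in log)


# Assumed scenario: a is false and cheap to compute, b is true and expensive.
TRUTH = {"a": False, "b": True}
COSTS = {"a": 1, "b": 100}

if __name__ == "__main__":
    print(run(["a", "b"], TRUTH, COSTS))  # a fails first, b is never evaluated
    print(run(["b", "a"], TRUTH, COSTS))  # b is evaluated before a fails
```

Under these assumed costs, the ordering that tests a first spends 1 unit and the
reversed ordering spends 101, while the result is False in both cases.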

Unfortunately, not all efficiency improvements on theories require minimization of
distinctions. In fact, making algorithms efficient often involves making more distinctions.
For instance, quicksort does clever bookkeeping compared to the simple-minded bubble
sort algorithm. The worst-case time for both algorithms is the same, and a very
fine-grained average-case complexity analysis is needed in order to explain why quicksort does
better than bubble sort. This reformulation from bubble sort to quicksort is an
encoding-level shift as opposed to a conceptual shift, because both are sorting algorithms at the
specification level. However, we can write irrelevance claims on the computation trace;
the reduction of the formulation by these claims then requires making finer distinctions
rather than removing them.
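The comparison between the two sorting algorithms can be checked by counting
comparisons. The sketch below is illustrative only and rests on stated assumptions: a
naive quicksort that always picks the first element as pivot, a bubble sort without early
exit, and one comparison charged per non-pivot element in each partition step. The
function names and the input size are hypothetical choices.

```python
import random


def bubble_sort_comparisons(items):
    """Return (sorted list, number of comparisons) for a plain bubble sort."""
    data = list(items)
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons


def quicksort_comparisons(items):
    """Return (sorted list, number of comparisons) for first-pivot quicksort."""
    data = list(items)
    if len(data) <= 1:
        return data, 0
    pivot, rest = data[0], data[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    left, left_count = quicksort_comparisons(smaller)
    right, right_count = quicksort_comparisons(larger)
    # Each non-pivot element is charged one comparison against the pivot.
    return left + [pivot] + right, len(rest) + left_count + right_count


N = 200
SORTED_INPUT = list(range(N))
SHUFFLED_INPUT = list(range(N))
random.Random(0).shuffle(SHUFFLED_INPUT)

if __name__ == "__main__":
    for label, data in (("sorted", SORTED_INPUT), ("shuffled", SHUFFLED_INPUT)):
        _, bubble = bubble_sort_comparisons(data)
        _, quick = quicksort_comparisons(data)
        print(label, "bubble:", bubble, "quick:", quick)
```

On the already-sorted input, which is a worst case for this pivot rule, both algorithms
make N(N-1)/2 comparisons; on the shuffled input quicksort makes far fewer, which is the
average-case difference the paragraph refers to.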

There are limitations introduced by the specific irrelevance discovery and reduction
methods discussed in this thesis. The subset irrelevance definition is not powerful enough
to explain and generate Type 2 abstractions. The set abstraction in the missionaries and
cannibals problem is an example of this class. Refinement reformulations that involve
introducing primitives that are not definable in terms of what is known are impossible to
automate using irrelevance reformulations.

6.2.2  The  power  of  the  meta-theoretic  approach 

The method of irrelevance minimization is an analytical technique for learning new
vocabulary terms for the purpose of improving the efficiency of computation. This is done by
analyzing the task requirements and modifying old representations to meet more specific
goals. The knowledge brought to bear on this process includes that of representations, the
problem solver and the purpose of the representation. The calculus of WI and CI allows
intensional specification of the properties of the conceptualization and the computation.
The meta-theoretic irrelevance principle then creates minimal conceptualizations
with encodings that perform minimal computation. The tools for describing
conceptualizations, proof and search spaces, along with the well-defined methods of modifying them
in goal-sensitive ways, are critical for making this method of reformulation feasible.

6.2.3  Issues  in  Validation 

In  his  insightful  commentary  on  the  field,  Marr  declared  that  the  goal  of  AI  is  to  study 
useful  information  processing  problems  and  to  propose  an  abstract  account  of  how  to 
solve  them.  A  result  in  AI  consists  of  the  isolation  of  a  particular  information  processing 
problem  and  the  statement  of  a  method  for  solving  it. 

The problem of reformulating representations to make them computationally efficient
with respect to a set of goals has been isolated. A clean Type 1 theory that describes a
normative principle that governs the class of abstraction reformulations has been
discovered. A partial inversion of this principle that generates abstractions automatically has
also been accomplished.

The methodology adopted involved doing theory before practice because the
phenomenon was too ill-understood to benefit from programming. An attempt to program
would only have led to the development of special methods that work in a specific domain.
The goal of the thesis was to obtain an understanding of the reformulation phenomenon
well enough to make it feasible to suggest good reformulations in a variety of domains.
As is necessary in a ground-breaking study, a fruit fly was needed to fuel the research: an
example that was simple enough to do paper and pencil analyses on, and complex enough
to contain the essential difficulties of automating reformulation. The kinship example was
chosen because it satisfied both criteria. Since the introduction of the FoundingFather
relation symbol is a general graph-theoretic transformation, its derivation would apply to
almost any graph search problem, and all formulations of AI problems in the state space
model fit this framework. So generality was not sacrificed.

How is this theory to be validated? To do this, we use the tri-step framework described
in Professor Buchanan’s essay on validating AI theories and systems [Buc87]. The
theoretical and the analytical steps of formulating the problem, and proving that the proposed
solution will work, have been carried out. The empirical validation was carried out by
implementing the irrelevance reformulator described in Chapter 5. The implementation
has been tested on some small examples in graph theory and analog circuits. The power of
the approach is demonstrated by the fact that the reformulator was able to automatically
synthesize a new conceptual primitive that is beyond the reach of reformulators that are
cliche-based. This is the first existence proof of a first-principles reformulator. The scaling
up of the implementation to tackle larger scale problems will require conceptual advances
in the ability to describe search spaces intensionally. We intend to apply it to describe an
operational amplifier at various levels of abstraction. We also wish to test the applicability
of the method to non-discrete domains: in particular, the discretization of space to make
path planning problems easier to solve.

The analysis of why the method works and the cases where it works is accomplished
theoretically by proving appropriate theorems that establish the limits as well as the
capabilities of the method. However, ablation studies have not been performed. What
happens to the method if any of the conditions specified by the theorems are violated?
Answering this requires the design of good experiments on the implementation, and much
thought needs to be put into the enterprise. This work will be done as a follow-up validation
project.

The quantitative results in this thesis are results on reduction in size (expressed in
big-O terms) of the search space by reduction methods. Some are results on the complexity
of verification of irrelevance claims and the efficiency of irrelevance reasoning. The lack
of good tools for measuring the impact of removal of irrelevant information affects our
ability to make more precise claims at this point. Much of the work has been
descriptive rather than prescriptive in nature; the inversion of the normative irrelevance
principle has been achieved on a small class of problems.

Even though this is a formal thesis in the sense defined in [Buc87], the theorizing
was done with actual data in mind. Examples of reformulation behaviour that we wanted
to capture included all shifts to formulations of coarser granularity. There is informal
psychological evidence that humans are very good at this. Unfortunately, there is almost
no formal psychological data gathered on reformulation behaviour in humans that is of the
kind investigated in this thesis. Part of the reason is that the irrelevance principle is compiled
into the human reasoning system and we rarely introspect about it; we simply ignore
detail that is irrelevant to the present goals. However, our robots are not endowed with
this mechanism and this thesis presents an analysis of the mechanism, along with ways of
actually compiling it into the behaviour of robots.

There is a deep concern for the practical tractability of the reformulation process itself.
Chapter 5 deals exclusively with ways of defining good representations of
conceptualizations, encodings, search and proof spaces and how to use them to compile out special cases
of irrelevance minimization.

6.3  Future  Research 

Future  research  plans  include  extending  the  theoretical  framework  by  investigating  a  richer 
class  of  irrelevance  claims  that  form  the  basis  for  automating  approximations  and  inductive 
reformulations.  The  experimental  component  seeks  to  test  the  theoretical  methods  in 
large  scale  problems  in  engineering  design  and  planning,  with  a  view  to  extending  the 
capabilities  of  present-day  computer-aided  design  systems. 

6.3.1  Theoretical  Issues 

1.  Extensions  to  the  Theory  of  Irrelevance 

• Approximations: The choice of primitives to describe a problem is directly
related to their value in the solution of the problem. Abstraction reformulations
can be seen as shifting of the focus of attention of the problem solver from
the irrelevant aspects of the problem to the essential ones. The only
abstractions considered so far were pure deductive ones: the reformulation did not
affect the correctness, only the efficiency of inference. A large class of
useful abstractions called approximations trade accuracy for efficiency. A classic
case is the simplification of the hybrid-pi model of a transistor to the
base-emitter model under the low frequency condition. We propose to automate
their construction by relaxing our constraints on the specification of irrelevance
claims to allow for approximate correctness. We shall then develop lemmas
of approximate irrelevance and reduction schemes that will alter formulations
to minimize approximate irrelevance. Our test bed will be problems in
planning in the blocks-world and simplification of transistor models. The results
that we hope to obtain are general-purpose approximation methods that can
be used across domains to relax models to incrementally trade off accuracy for
efficiency.

• Iterative minimization methods: The methods developed in this thesis
construct 1-level abstractions (of which FoundingFather is an example). We would
like to extend the methods to develop abstraction hierarchies as in Abstrips.
The levels would correspond to gradual ignoring of information. The basic
method would be the ordered reduction of a formulation by irrelevance claims
as opposed to the simultaneous reduction used in Chapter 4. The development
of techniques to achieve this requires construction of cost models that allow
for the rapid calculation of the effect of reduction by an irrelevance claim to
decide on a good ordering for reduction. Our test bed is Abstrips: we will start
from the most detailed description of the action operations and facts about
preconditions (how easy they are to achieve), and construct a hierarchy of action
operators that minimizes planning time for given classes of planning goals.

• Extending the irrelevance claims to specify probabilistic information: many
irrelevance claims in the world are of a probabilistic nature. The claims
describing factors that are irrelevant in a medical diagnosis task are an example.
To allow their specification, we need to examine the semantics of probabilistic
irrelevance. This would be a precursor to the development of methods to act
upon these claims.

2.  Automating  Refinement  and  Isomorphic  Reformulations 

The reasoning needed to accomplish reformulations consists of means of evaluating
the epistemic and computational consequences of perturbing the conceptualization
either by the introduction or the removal of a conceptual element. Irrelevance claims
were a particularly nice form of justification because they tied the exclusion of a
conceptual element directly to its computational consequences. Finding further
justifications of this form to automate isomorphic and refinement reformulation is a
logical next step in our theoretical investigations.

• Development of further invariants in the representation-inference tradeoff: The
method of irrelevance minimization allows for the introduction of new
concepts that simplify inference to achieve a certain class of goals. This is done
by pruning entire subtrees in the computation. A related method that we
will investigate is the substitution of subtrees by simpler computations and
propagating their effects into the vocabulary for the problem. The concept of
substitutability will be formalized with an eye to justifying and automating the
class of refinement and isomorphic reformulations.

• Inductive reformulations: the extension of these methods to cover non-deductive
inference is a project begun in [RS88]. Related work on this is done under the
rubric of the study of bias in machine learning. Non-deductive systems are
very sensitive to the form of the representation chosen for the premises of each
inference. With one representation the system may return the correct solution,
with another it may not, even though they both contain the same information.
Inductive reformulations seek to transform a given representation to one that
maximizes correctness of conclusions with respect to a given problem solver.
Our initial focus for the study of this phenomenon will be reformulating
descriptions to make analogy by similarity (by counting features) work.

• Symmetry reformulations: Symmetry is a general kind of redundancy. If a
system possesses symmetry, then the behaviour of a subpart can be computed by
knowing the behaviour of a symmetric subpart and the symmetry function. For
redundancy, the symmetry function is the identity function. Using the search
space metaphor, symmetry is said to exist in a search tree if a subtree can be
computed from another subtree plus a translation function. This amounts to
reusing old computation and thus leads to computational efficiency. Automating
symmetry reformulations allows us to generalize the class of irrelevance
reformulations.

3.  Improving  the  Efficiency  of  Deriving  Reformulations 

• Reformulation Algorithms: An approach to containing the complexity of the
first-principles reasoning is to compile some of the reduction inferences into
graph-theoretic algorithms. The FoundingFather transformation can be
compiled down to a standard union-find algorithm. We will investigate classes of
irrelevance minimization inferences that can be subject to this kind of
compilation.

• Reduction Lemmas: The complexity of intensional reasoning requires that we
develop reduction lemmas akin to the irrelevance lemmas to speed up the
process. Work on this as well as on better intensional descriptions of proof and
search spaces is critical to making the approach a practical one.
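The remark under Reformulation Algorithms, that the FoundingFather transformation can
be compiled down to union-find, can be sketched as follows. This is an illustration
under assumptions rather than the compiled procedure itself: it supposes that
FoundingFather maps each person to a canonical representative of a family, that each
person has at most one recorded father, and it uses invented names (`UnionFind`,
`same_family`, and the `FATHER_FACTS` table are hypothetical).

```python
class UnionFind:
    """Minimal union-find with path compression over hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the representative (here: the founding father) of x."""
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, child, father):
        """Merge the child's family into the father's family."""
        child_root, father_root = self.find(child), self.find(father)
        if child_root != father_root:
            self.parent[child_root] = father_root


# Hypothetical Father(child, father) facts.
FATHER_FACTS = [("bob", "al"), ("carl", "bob"), ("dave", "al"), ("fred", "ed")]


def build(facts):
    families = UnionFind()
    for child, father in facts:
        families.union(child, father)
    return families


def same_family(families, x, y):
    """Two people are in the same family iff they share a representative."""
    return families.find(x) == families.find(y)


if __name__ == "__main__":
    fam = build(FATHER_FACTS)
    print(fam.find("carl"), same_family(fam, "carl", "dave"))
```

Because each union attaches the child's root beneath the father's root, the
representative returned by `find` is the ancestor with no recorded father, so a
same-family query reduces to comparing two representatives instead of searching the
ancestor graph.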

6.3.2 Experimental Projects

No amount of theoretical investigation can replace a practical implementation of the
ideas being investigated, and such an implementation is a key part of the research effort
we are proposing. The prototype irrelevance reformulator built in Chapter 5 automatically
derives the partition representation of an equivalence relation, as well as the concept of
Thevenin equivalents from Kirchhoff’s and Ohm’s laws. This is a capability that no other
AI system possesses. This reformulator is built upon the meta-level reasoning system
MRS [Gen83b,Gen83a,Rus85]. MRS is extensively used in academic and industrial
environments and provides its users with a variety of knowledge representation and inference
procedures; it is one of the few knowledge representation languages in AI that provide
constructs for describing representations and inference methods. This feature is critical
for the development of the meta-theoretic reasoning methods for reformulation. The
prototype reformulator built in this thesis uses MRS and Lisp. The extensions that we plan
to make on this prototype system include:

1.  A  Reformulation  Assistant 

The  inflexibility  of  present  day  design  environments  is  largely  due  to  their  inability  to 
reason  with  multiple  models  of  a  domain  at  different  granularity  levels.  The  theory  of 
incremental  reformulations  proposed  here  can  be  used  to  synthesize  abstractions  of 
a  detailed  model  in  a  goal-directed  manner.  We  intend  to  test  our  theory  and  extend 
the  available  set  of  reduction  lemmas  by  building  a  reformulation  assistant.  This 
is  a  system  that  accepts  irrelevance  claims  from  a  domain-expert  and  synthesizes  a 
formulation  that  doesn’t  make  the  distinctions  specified  by  the  irrelevance  claims. 

• With the advances being made in the technology of manufacturing digital
devices it is possible to build systems of unprecedented complexity. Representing
and reasoning about such systems requires describing them at varying levels of
abstraction to contain the complexity of tasks like diagnosis and test
generation. This places a large burden on the specifier of a system. Also the system
is constrained by the fixed abstraction levels provided. The methods proposed
here can be used to synthesize abstraction levels that are tuned to
particular task requirements. We plan to test this idea in the context of the
Helios design environment [Sin86,Gen84]. One of the first projects will be to
take the specification of a full adder at the gate level and derive the functional
specification that abstracts details of the structure so as to make simulating
the circuit extremely efficient. The examples of structural and functional
abstractions presented in Chapter 2 of [Sin86] will then be automated: this will
be an excellent demonstration of the power of automated reformulation.

• The abstractions in the digital circuits domain involve moving from discrete to
more discrete descriptions. To test the utility of the theory in abstracting
continuous phenomena, we will explore automatic discretization of space to make
motion planning efficient in collaboration with Bruce Donald at the Computer
Science Department at Cornell University. A first step is the expression of
discretization criteria [LP83,Bro83,Don87] as irrelevance claims and the
development of special-purpose reduction methods that generate tilings of a given
region in 3-space.
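The full-adder project mentioned above contrasts a gate-level description with a
functional one. The sketch below only shows what the two descriptions might look like
and checks that they agree on every input; it does not derive one from the other, and it
is not taken from Helios or [Sin86]. The particular gate netlist and the function names
are assumptions.

```python
from itertools import product


def full_adder_gates(a, b, carry_in):
    """Gate-level description: two XOR gates, two AND gates, one OR gate."""
    x1 = a ^ b
    total = x1 ^ carry_in
    a1 = a & b
    a2 = x1 & carry_in
    carry_out = a1 | a2
    return total, carry_out


def full_adder_functional(a, b, carry_in):
    """Functional description: the internal wires x1, a1, a2 are abstracted away."""
    value = a + b + carry_in
    return value % 2, value // 2


def equivalent():
    """Check that both descriptions agree on all eight input combinations."""
    return all(
        full_adder_gates(*bits) == full_adder_functional(*bits)
        for bits in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print(equivalent())
```

The functional version makes no reference to the intermediate signals, which is the kind
of distinction an irrelevance claim would declare unnecessary when the goal is only to
simulate the input-output behaviour.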

2.  Combining  First-Principles  and  Cliched  Reformulations 

Our prototype reformulator works from first principles. The meta-theoretic reasoning
required is expensive: we therefore wish to explore an architecture that allows
integration of the use of previously derived reformulations (cliched reformulations)
with the ability to synthesize new ones on demand. This requires generalizing a
newly derived reformulation so as to increase its range of application. For instance,
the FoundingFather reformulation can be generalized to be a useful reformulation for
the computation of any equivalence relation defined in terms of a partially ordered
relation. Generalization of this kind can be accomplished by using the well-established
method of explanation-based generalization [MTMS86]. The novel aspect of this use
of EBG is that the explanations formed and generalized are meta-theoretical.